DPO4 polymerase variants

ABSTRACT

Recombinant DPO4-type DNA polymerase variants with amino acid substitutions that confer modified properties upon the polymerase for improved single molecule sequencing applications are provided. Such properties may include enhanced binding and incorporation of bulky nucleotide analog substrates into daughter strands and the like. Also provided are compositions comprising such DPO4 variants and nucleotide analogs, as well as nucleic acids which encode the polymerases with the aforementioned phenotypes.

STATEMENT REGARDING SEQUENCE LISTING

The Sequence Listing associated with this application is provided in text format in lieu of a paper copy, and is hereby incorporated by reference into the specification. The name of the text file containing the Sequence Listing is 870225_415WO_SEQUENCE_LISTING.txt. The text file is 344 KB, was created on Nov. 10, 2016, and is being submitted electronically via EFS-Web.

FIELD OF THE INVENTION

The disclosure relates generally to polymerase compositions and methods. More particularly, the disclosure relates to modified DPO4 polymerases and their use in biological applications including, for example, nucleotide analogue incorporation, primer extension and single molecule sequencing reactions.

BACKGROUND OF THE INVENTION

DNA polymerases replicate the genomes of living organisms. In addition to this central role in biology, DNA polymerases are also ubiquitous tools of biotechnology. They are widely used, e.g., for reverse transcription, amplification, labeling, and sequencing, all central technologies for a variety of applications, such as nucleic acid sequencing, nucleic acid amplification, cloning, protein engineering, diagnostics, molecular medicine, and many other technologies.

Because of their significance, DNA polymerases have been extensively studied, with a focus, e.g., on phylogenetic relationships among polymerases, structure of polymerases, structure-function features of polymerases, and the role of polymerases in DNA replication and other basic biological processes, as well as ways of using DNA polymerases in biotechnology. For a review of polymerases, see, e.g., Hubscher et al. (2002) “Eukaryotic DNA Polymerases” Annual Review of Biochemistry Vol. 71: 133-163, Alba (2001) “Protein Family Review: Replicative DNA Polymerases” Genome Biology 2(1): reviews 3002.1-3002.4, Steitz (1999) “DNA polymerases: structural diversity and common mechanisms” J Biol Chem 274:17395-17398, and Burgers et al. (2001) “Eukaryotic DNA polymerases: proposal for a revised nomenclature” J Biol. Chem. 276(47):43487-90. Crystal structures have been solved for many polymerases, which often share a similar architecture. The basic mechanisms of action for many polymerases have been determined.

A fundamental application of DNA polymerases is in DNA sequencing technologies. From the classical Sanger sequencing method to recent “next-generation” sequencing (NGS) technologies, the nucleotide substrates used for sequencing have necessarily changed over time. The series of nucleotide modifications required by these rapidly changing technologies has introduced daunting tasks for DNA polymerase researchers to look for, design, or evolve compatible enzymes for ever-changing DNA sequencing chemistries. DNA polymerase mutants have been identified that have a variety of useful properties, including altered nucleotide analog incorporation abilities relative to wild-type counterpart enzymes. For example, Vent^(A488L) DNA polymerase can incorporate certain non-standard nucleotides with a higher efficiency than native Vent DNA polymerase. See Gardner et al. (2004) “Comparative Kinetics of Nucleotide Analog Incorporation by Vent DNA Polymerase” J. Biol. Chem. 279(12):11834-11842 and Gardner and Jack (1999) “Determinants of nucleotide sugar recognition in an archaeon DNA polymerase” Nucleic Acids Research 27(12):2545-2553. The altered residue in this mutant, A488, is predicted to be facing away from the nucleotide binding site of the enzyme. The pattern of relaxed specificity at this position roughly correlates with the size of the substituted amino acid side chain and affects incorporation by the enzyme of a variety of modified nucleotide sugars.

More recently, NGS technologies have introduced the need to adapt DNA polymerase enzymes to accept nucleotide substrates modified with reversible terminators on the 3′-OH, such as -ONH₂. To this end, Chen and colleagues combined structural analyses with a “reconstructed evolutionary adaptive path” analysis to generate a TAQ^(L616A) variant that is able to efficiently incorporate both reversible and irreversible terminators. See Chen et al. (2010) “Reconstructed Evolutionary Adaptive Paths Give Polymerases Accepting Reversible Terminators for Sequencing and SNP Detection” Proc. Nat. Acad. Sci. 107(5):1948-1953. Modeling studies suggested that this variant might open space behind Phe-667, allowing it to accommodate the larger 3′ substituents. U.S. Pat. No. 8,999,676 to Emig et al. discloses additional modified polymerases that display improved properties useful for single molecule sequencing technologies based on fluorescent detection. In particular, substitution of φ29 DNA polymerase at positions E375 and K512 was found to enhance the ability of the polymerase to utilize non-natural, phosphate-labeled nucleotide analogs incorporating different fluorescent dyes.

Recently, Kokoris et al. have described a method, termed “sequencing by expansion” (SBX), that uses a DNA polymerase to transcribe the sequence of DNA onto a measurable polymer called an Xpandomer (see, e.g., U.S. Pat. No. 8,324,360 to Kokoris et al.). The transcribed sequence is encoded along the Xpandomer backbone in high signal-to-noise reporters that are separated by ~10 nm and are designed for high signal-to-noise, well differentiated responses when read by nanopore-based sequencing systems. Xpandomers are generated from non-natural nucleotide analogs, termed XNTPs, characterized by bulky substituents that enable the Xpandomer backbone to be expanded following synthesis. Such XNTP analogs introduce novel challenges as substrates for currently available DNA polymerases.

Thus, new modified polymerases, e.g., modified polymerases that display improved properties useful for nanopore-based sequencing and other polymerase applications (e.g., DNA amplification, sequencing, labeling, detection, cloning, etc.), would find value in the art. The present invention provides new recombinant DNA polymerases with desirable properties, including the ability to incorporate nucleotide analogs with bulky substitutions with improved efficiency. Also provided are methods of making and using such polymerases, and many other features that will become apparent upon a complete review of the following.

SUMMARY

Recombinant DNA polymerases and modified DNA polymerases, e.g., modified DPO4, can find use in such applications as, e.g., single-molecule sequencing by expansion (SBX). Among other aspects, the invention provides recombinant DNA polymerases and modified DNA polymerase variants comprising mutations that confer properties, which can be particularly desirable for these applications. These properties can, e.g., improve the ability of the polymerase to utilize bulky nucleotide analogs as substrates during template-dependent polymerization of a daughter strand. Also provided are compositions comprising such DNA polymerases and modified DPO4-type polymerases, nucleic acids encoding such modified polymerases, methods of generating such modified polymerases and methods in which such polymerases can be used, e.g., to sequence a DNA template.

One general class of embodiments provides a recombinant DPO4-type DNA polymerase that is at least 90% identical to SEQ ID NO:1 and has at least one mutation at a position selected from the group consisting of amino acids 76, 78, 79, 82, 83, and 86, in which identification of positions is relative to wildtype DPO4 polymerase (SEQ ID NO:1), and in which the recombinant DNA polymerase exhibits polymerase activity. Exemplary mutations at positions 76, 78, 79, 82, 83, and 86 include M76H, M76W, M76V, M76A, M76S, M76L, M76T, M76C, M76F, M76Q, K78P, K78N, K78Q, K78T, K78L, K78V, K78S, K78F, K78E, K78M, K78A, K78I, K78H, K78Y, K78G, E79L, E79M, E79W, E79V, E79N, E79Y, E79G, E79S, E79H, E79A, E79R, E79T, E79P, E79D, E79F, Q82Y, Q82W, Q82N, Q82S, Q82H, Q82D, Q82E, Q82G, Q82M, Q82R, Q82K, Q82V, Q82T, Q83G, Q83R, Q83S, Q83T, Q83I, Q83E, Q83M, Q83D, Q83A, Q83K, Q83H, S86E, S86L, S86W, S86K, S86N, S86Q, S86V, S86M, S86T, S86G, S86R, S86A, and S86D. In other embodiments, the recombinant DPO4-type DNA polymerase is represented by the amino acid sequence as set forth in any one of SEQ ID NOs: 2-46.
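By way of illustration only, the substitution shorthand used above (for example, M76H: wild-type residue, 1-based position relative to SEQ ID NO:1, replacement residue) can be sketched in Python. The function names `parse_substitution`, `apply_substitutions`, and `percent_identity` are hypothetical, and the ten-residue `toy_reference` is an invented placeholder, not SEQ ID NO:1. The identity calculation is a simple per-position comparison of equal-length sequences, which is an assumption made for brevity; identity between sequences of different lengths would ordinarily be computed from an alignment.

```python
import re


def parse_substitution(code):
    """Split a code such as 'M76H' into (wild-type residue, 1-based position, new residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def apply_substitutions(reference, codes):
    """Return a copy of `reference` with each substitution applied.

    Positions are numbered relative to `reference`; the stated wild-type
    residue must match the reference at that position.
    """
    residues = list(reference)
    for code in codes:
        wild_type, position, replacement = parse_substitution(code)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{code}: position outside the reference sequence")
        if reference[position - 1] != wild_type:
            raise ValueError(f"{code}: reference has {reference[position - 1]} at {position}")
        residues[position - 1] = replacement
    return "".join(residues)


def percent_identity(seq_a, seq_b):
    """Percent of identical positions between two equal-length sequences (no alignment)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("this sketch only compares equal-length sequences")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Hypothetical toy reference used only to exercise the functions above.
toy_reference = "MKEQQSAGTL"
toy_variant = apply_substitutions(toy_reference, ["M1W", "K2N"])
toy_identity = percent_identity(toy_reference, toy_variant)
```

With a real reference sequence, the same functions could be used to check that a candidate variant still meets an identity threshold such as the 90% recited above.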

In a related aspect, the invention provides compositions containing any of the recombinant DPO4-type DNA polymerases set forth above. In certain embodiments, the compositions may also contain at least one non-natural nucleotide analog substrate.

In another related aspect, the invention provides modified nucleic acids encoding any of the modified DPO4-type DNA polymerases set forth above.

Another general class of embodiments provides a recombinant DPO4-type DNA polymerase that is at least 90% identical to SEQ ID NO:1 and has mutations at positions 76, 78, 79, 82, 83, and 86 and at least one additional mutation at a position selected from the group consisting of amino acids 5, 42, 56, 62, 66, 141, 150, 152, 153, 155, 156, 184, 187, 189, 190, 212, 214, 215, 217, 221, 226, 240, 241, 248, 289, 290, 291, 292, 293, 300, and 326, in which identification of positions is relative to wildtype DPO4 polymerase (SEQ ID NO:1), and in which the recombinant DNA polymerase exhibits polymerase activity. In some embodiments, exemplary mutations at positions 76, 78, 79, 82, 83, and 86 include M76W, K78N, E79L, Q82W, Q82Y, Q83G, and S86E. In other embodiments, exemplary mutations at positions 5, 42, 56, 62, 66, 141, 150, 152, 153, 155, 156, 184, 187, 189, 190, 212, 214, 215, 217, 221, 226, 240, 241, 248, 289, 290, 291, 292, 293, 300, and 326 include F5Y, A42V, V62R, K66R, T141S, F150L, K152A, K152G, K152M, K152P, I153F, I153Q, I153W, A155L, A155M, A155N, A155V, A155G, D156Y, D156W, P184L, G187W, G187D, G187E, I189W, T190Y, T190D, T190E, K212V, K212L, K212A, K214S, G215F, I217V, K221D, K221E, K221Q, I226F, R240S, R240T, V241N, V241R, I248A, I248T, V289W, T290K, E291S, D292Y, L293F, L293W, R300E, R300V, and D326E. In other embodiments, the recombinant DPO4-type DNA polymerase is represented by the amino acid sequence as set forth in any one of SEQ ID NOs: 47-115.

In yet another embodiment, the recombinant DPO4-type polymerase further includes a deletion to remove the terminal 12 amino acids (i.e., the PIP box region) of the protein.

In a related aspect, the invention provides compositions containing any of the recombinant DNA polymerases or DPO4-type DNA polymerases set forth above. In certain embodiments, the compositions may also contain at least one non-natural nucleotide analog substrate.

In another related aspect, the invention provides modified nucleic acids encoding any of the recombinant DNA polymerases or modified DPO4-type DNA polymerases set forth above.

Another general class of embodiments provides an isolated recombinant DNA polymerase in which the recombinant DNA polymerase is capable of synthesizing nucleic acid daughter strands using nucleotide analog substrates having the following structure:

in which T represents a tether; N represents a nucleobase residue; V represents an internal cleavage site of the nucleobase residue; and R¹ and R² represent the same or different end groups for the template directed synthesis of the daughter strand. In some embodiments, the recombinant DNA polymerase is a class Y DNA polymerase or a variant of a class Y DNA polymerase. In other embodiments, the recombinant DNA polymerase is DPO4 or Dbh or a variant of DPO4 or Dbh. In yet other embodiments, the recombinant DNA polymerase has a deletion to remove the PIP box region of the protein. In other embodiments, the deletion removes the terminal 12 amino acids of the protein.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 shows the amino acid sequence of the DPO4 polymerase protein (SEQ ID NO: 1) with the Mut_1 through Mut_13 regions outlined and variable amino acids underscored.

DEFINITIONS

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the invention pertains. The following definitions supplement those in the art and are directed to the current application and are not to be imputed to any related or unrelated case, e.g., to any commonly owned patent or application. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, the preferred materials and methods are described herein. Accordingly, the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting.

As used in this specification and the appended claims, the singular forms “a,” “an” and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a protein” includes a plurality of proteins; reference to “a cell” includes mixtures of cells, and the like.

The term “about” as used herein indicates the value of a given quantity varies by +/−10% of the value, or optionally +/−5% of the value, or in some embodiments, by +/−1% of the value so described.
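As a worked illustration of this definition, the following minimal Python sketch computes the range covered by “about” a stated value at the 10%, 5%, and 1% tolerances named above. The helper names `about_range` and `is_about` are hypothetical and are introduced here only for the example.

```python
def about_range(stated_value, tolerance=0.10):
    """Return (low, high) bounds for 'about' a value at the given fractional tolerance."""
    delta = abs(stated_value) * tolerance
    return stated_value - delta, stated_value + delta


def is_about(measured_value, stated_value, tolerance=0.10):
    """True when the measured value lies within the tolerance band around the stated value."""
    low, high = about_range(stated_value, tolerance)
    return low <= measured_value <= high


# Example: a stated value of 100 (arbitrary units) under each tolerance.
within_ten_percent = is_about(95, 100, tolerance=0.10)   # inside the 10% band
within_five_percent = is_about(95, 100, tolerance=0.05)  # on the edge of the 5% band
within_one_percent = is_about(95, 100, tolerance=0.01)   # outside the 1% band
```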

“Nucleobase” is a heterocyclic base such as adenine, guanine, cytosine, thymine, uracil, inosine, xanthine, hypoxanthine, or a heterocyclic derivative, analog, or tautomer thereof. A nucleobase can be naturally occurring or synthetic. Non-limiting examples of nucleobases are adenine, guanine, thymine, cytosine, uracil, xanthine, hypoxanthine, 8-azapurine, purines substituted at the 8 position with methyl or bromine, 9-oxo-N6-methyladenine, 2-aminoadenine, 7-deazaxanthine, 7-deazaguanine, 7-deaza-adenine, N4-ethanocytosine, 2,6-diaminopurine, N6-ethano-2,6-diaminopurine, 5-methylcytosine, 5-(C3-C6)-alkynylcytosine, 5-fluorouracil, 5-bromouracil, thiouracil, pseudoisocytosine, 2-hydroxy-5-methyl-4-triazolopyridine, isocytosine, isoguanine, inosine, 7,8-dimethylalloxazine, 6-dihydrothymine, 5,6-dihydrouracil, 4-methyl-indole, ethenoadenine and the non-naturally occurring nucleobases described in U.S. Pat. Nos. 5,432,272 and 6,150,510 and PCT Publication Nos. WO 92/002258, WO 93/10820, WO 94/22892, and WO 94/24144, and Fasman (“Practical Handbook of Biochemistry and Molecular Biology”, pp. 385-394, 1989, CRC Press, Boca Raton, Fla.), all herein incorporated by reference in their entireties.

“Nucleobase residue” includes nucleotides, nucleosides, fragments thereof, and related molecules having the property of binding to a complementary nucleotide. Deoxynucleotides and ribonucleotides, and their various analogs, are contemplated within the scope of this definition. Nucleobase residues may be members of oligomers and probes. “Nucleobase” and “nucleobase residue” may be used interchangeably herein and are generally synonymous unless context dictates otherwise.

“Polynucleotides”, also called nucleic acids, are covalently linked series of nucleotides in which the 3′ position of the pentose of one nucleotide is joined by a phosphodiester group to the 5′ position of the next. DNA (deoxyribonucleic acid) and RNA (ribonucleic acid) are biologically occurring polynucleotides in which the nucleotide residues are linked in a specific sequence by phosphodiester linkages. As used herein, the terms “polynucleotide” or “oligonucleotide” encompass any polymer compound having a linear backbone of nucleotides. Oligonucleotides, also termed oligomers, are generally shorter chained polynucleotides.

“Nucleic acid” is a polynucleotide or an oligonucleotide. A nucleic acid molecule can be deoxyribonucleic acid (DNA), ribonucleic acid (RNA), or a combination of both. Nucleic acids are generally referred to as “target nucleic acids” or “target sequence” if targeted for sequencing. Nucleic acids can be mixtures or pools of molecules targeted for sequencing.

A “polynucleotide sequence” or “nucleotide sequence” is a polymer of nucleotides (an oligonucleotide, a DNA, a nucleic acid, etc.) or a character string representing a nucleotide polymer, depending on context. From any specified polynucleotide sequence, either the given nucleic acid or the complementary polynucleotide sequence (e.g., the complementary nucleic acid) can be determined.

A “polypeptide” is a polymer comprising two or more amino acid residues (e.g., a peptide or a protein). The polymer can additionally comprise non-amino acid elements such as labels, quenchers, blocking groups, or the like and can optionally comprise modifications such as glycosylation or the like. The amino acid residues of the polypeptide can be natural or non-natural and can be unsubstituted, unmodified, substituted or modified.

An “amino acid sequence” is a polymer of amino acid residues (a protein, polypeptide, etc.) or a character string representing an amino acid polymer, depending on context.

Numbering of a given amino acid or nucleotide polymer “corresponds to numbering of” or is “relative to” a selected amino acid polymer or nucleic acid when the position of any given polymer component (amino acid residue, incorporated nucleotide, etc.) is designated by reference to the same residue position in the selected amino acid or nucleotide polymer, rather than by the actual position of the component in the given polymer. Similarly, identification of a given position within a given amino acid or nucleotide polymer is “relative to” a selected amino acid or nucleotide polymer when the position of any given polymer component (amino acid residue, incorporated nucleotide, etc.) is designated by reference to the residue name and position in the selected amino acid or nucleotide polymer, rather than by the actual name and position of the component in the given polymer. Correspondence of positions is typically determined by aligning the relevant amino acid or polynucleotide sequences.
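The correspondence of positions described in this definition can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned by some external tool and are supplied as equal-length strings in which “-” marks a gap; the function name `map_reference_positions` and the short toy sequences are hypothetical and are not taken from the Sequence Listing.

```python
def map_reference_positions(aligned_reference, aligned_query, gap="-"):
    """Map 1-based reference positions to 1-based query positions.

    Only alignment columns in which both sequences carry a residue are
    recorded; reference residues aligned to a gap have no entry.
    """
    if len(aligned_reference) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    mapping = {}
    reference_position = 0
    query_position = 0
    for ref_residue, query_residue in zip(aligned_reference, aligned_query):
        if ref_residue != gap:
            reference_position += 1
        if query_residue != gap:
            query_position += 1
        if ref_residue != gap and query_residue != gap:
            mapping[reference_position] = query_position
    return mapping


# Toy alignment: the query lacks the first two reference residues and
# carries one inserted residue relative to the reference.
toy_aligned_reference = "MKEQ-QS"
toy_aligned_query = "--EQAQS"
toy_mapping = map_reference_positions(toy_aligned_reference, toy_aligned_query)
```

In this toy case, the residue at actual position 4 of the query would be identified as position 5 “relative to” the reference, because the two residues share an alignment column.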

The term “recombinant” indicates that the material (e.g., a nucleic acid or a protein) has been artificially or synthetically (non-naturally) altered by human intervention. The alteration can be performed on the material within, or removed from, its natural environment or state. For example, a “recombinant nucleic acid” is one that is made by recombining nucleic acids, e.g., during cloning, DNA shuffling or other procedures, or by chemical or other mutagenesis; a “recombinant polypeptide” or “recombinant protein” is, e.g., a polypeptide or protein which is produced by expression of a recombinant nucleic acid.

A “DPO4-type DNA polymerase” is a DNA polymerase naturally expressed by the archaeon Sulfolobus solfataricus, or a related Y-family DNA polymerase; such polymerases generally function in the replication of damaged DNA by a process known as translesion synthesis (TLS). Y-family DNA polymerases are homologous to the DPO4 polymerase (e.g., as listed in SEQ ID NO:1); examples include the prokaryotic enzymes, PolII, PolIV, PolV, the archaeal enzyme, Dbh, and the eukaryotic enzymes, Rev3p, Rev1p, Pol η, REV3, REV1, Pol ι, and Pol κ DNA polymerases, as well as chimeras thereof. A modified recombinant DPO4-type DNA polymerase includes one or more mutations relative to naturally-occurring wild-type DPO4-type DNA polymerases, for example, one or more mutations that increase the ability to utilize bulky nucleotide analogs as substrates or another polymerase property, and may include additional alterations or modifications over the wild-type DPO4-type DNA polymerase, such as one or more deletions, insertions, and/or fusions of additional peptide or protein sequences (e.g., for immobilizing the polymerase on a surface or otherwise tagging the polymerase enzyme).

“Template-directed synthesis”, “template-directed assembly”, “template-directed hybridization”, “template-directed binding” and any other template-directed processes, e.g., primer extension, refer to a process whereby nucleotide residues or nucleotide analogs bind selectively to a complementary target nucleic acid, and are incorporated into a nascent daughter strand. A daughter strand produced by a template-directed synthesis is complementary to the single-stranded target from which it is synthesized. It should be noted that the corresponding sequence of a target strand can be inferred from the sequence of its daughter strand, if that is known. “Template-directed polymerization” is a special case of template-directed synthesis whereby the resulting daughter strand is polymerized.
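The complementarity relationship described in this definition can be illustrated with a minimal Python sketch. It models only standard Watson-Crick pairing of the four natural DNA bases and ignores nucleotide analogs, tethers, and reporters; the function names `synthesize_daughter` and `infer_template` and the six-base toy template are hypothetical.

```python
# Watson-Crick pairing for the four natural DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize_daughter(template_5_to_3):
    """Return the daughter strand, written 5'->3', complementary to the template.

    Because the strands are antiparallel, the daughter read 5'->3' is the
    reverse complement of the template read 5'->3'.
    """
    return "".join(COMPLEMENT[base] for base in reversed(template_5_to_3))


def infer_template(daughter_5_to_3):
    """Infer the template sequence (5'->3') from a known daughter strand."""
    # Complementarity is symmetric, so the same operation recovers the template.
    return synthesize_daughter(daughter_5_to_3)


toy_template = "ATGGCA"
toy_daughter = synthesize_daughter(toy_template)
recovered_template = infer_template(toy_daughter)
```

The round trip from template to daughter and back shows why the sequence of a target strand can be inferred once the sequence of its daughter strand is known.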

“XNTP” is an expandable, 5′ triphosphate modified nucleotide substrate compatible with template dependent enzymatic polymerization. An XNTP has two distinct functional components; namely, a nucleobase 5′-triphosphate and a tether or tether precursor that is attached within each nucleotide at positions that allow for controlled RT expansion by intra-nucleotide cleavage.

“Xpandomer intermediate” is an intermediate product (also referred to herein as a “daughter strand”) assembled from XNTPs, and is formed by a template-directed assembly of XNTPs using a target nucleic acid template. The Xpandomer intermediate contains two structures; namely, the constrained Xpandomer and the primary backbone. The constrained Xpandomer comprises all of the tethers in the daughter strand but may comprise all, a portion or none of the nucleobase 5′-triphosphates as required by the method. The primary backbone comprises all of the abutted nucleobase 5′-triphosphates. Under the process step in which the primary backbone is fragmented or dissociated, the constrained Xpandomer is no longer constrained and is the Xpandomer product which is extended as the tethers are stretched out. “Duplex daughter strand” refers to an Xpandomer intermediate that is hybridized or duplexed to the target template.

“Xpandomer” or “Xpandomer product” is a synthetic molecular construct produced by expansion of a constrained Xpandomer, which is itself synthesized by template-directed assembly of XNTPs. The Xpandomer is elongated relative to the target template it was produced from. It is composed of a concatenation of XNTPs, each XNTP including a tether comprising one or more reporters encoding sequence information. The Xpandomer is designed to expand to be longer than the target template thereby lowering the linear density of the sequence information of the target template along its length. In addition, the Xpandomer optionally provides a platform for increasing the size and abundance of reporters which in turn improves signal to noise for detection. Lower linear information density and stronger signals increase the resolution and reduce sensitivity requirements to detect and decode the sequence of the template strand.
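The reduction in linear information density can be illustrated with a rough, back-of-the-envelope calculation in Python. The reporter spacing of approximately 10 nm is the figure given in the Background. The base spacing of 0.34 nm (the approximate rise per base pair of B-form duplex DNA) and the treatment of one reporter as encoding one base are assumptions made only for this sketch, and the function names are hypothetical.

```python
def linear_density(units, length_nm):
    """Information units (bases or reporters) per nanometer of polymer length."""
    if length_nm <= 0:
        raise ValueError("length must be positive")
    return units / length_nm


def expansion_factor(reporter_spacing_nm, base_spacing_nm):
    """Fold increase in length per encoded base after expansion."""
    if reporter_spacing_nm <= 0 or base_spacing_nm <= 0:
        raise ValueError("spacings must be positive")
    return reporter_spacing_nm / base_spacing_nm


REPORTER_SPACING_NM = 10.0  # approximate reporter separation stated in the Background
BASE_SPACING_NM = 0.34      # ASSUMPTION: approximate rise per base pair of B-form DNA

# ASSUMPTION: one reporter encodes one base of the template.
template_density = linear_density(1, BASE_SPACING_NM)
xpandomer_density = linear_density(1, REPORTER_SPACING_NM)
fold_expansion = expansion_factor(REPORTER_SPACING_NM, BASE_SPACING_NM)
```

Under these assumptions the expanded polymer carries roughly one thirtieth of the sequence information per unit length that the template carries, which is the sense in which expansion relaxes the resolution required of the detector.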

“Tether” or “tether member” refers to a polymer or molecular construct having a generally linear dimension and with an end moiety at each of two opposing ends. A tether is attached to a nucleobase 5′-triphosphate with a linkage in at least one end moiety to form an XNTP. The end moieties of the tether may be connected to cleavable linkages to the nucleobase 5′-triphosphate that serve to constrain the tether in a “constrained configuration”. After the daughter strand is synthesized, each end moiety has an end linkage that couples directly or indirectly to other tethers. The coupled tethers comprise the constrained Xpandomer that further comprises the daughter strand. Tethers have a “constrained configuration” and an “expanded configuration”. The constrained configuration is found in XNTPs and in the daughter strand. The constrained configuration of the tether is the precursor to the expanded configuration, as found in Xpandomer products. The transition from the constrained configuration to the expanded configuration results from cleaving of selectively cleavable bonds that may be within the primary backbone of the daughter strand or intra-tether linkages. A tether in a constrained configuration is also used where a tether is added to form the daughter strand after assembly of the “primary backbone”. Tethers can optionally comprise one or more reporters or reporter constructs along their length that can encode sequence information of substrates. The tether provides a means to expand the length of the Xpandomer and thereby lower the sequence information linear density.

“Tether element” or “tether segment” is a polymer having a generally linear dimension with two terminal ends, where the ends form end-linkages for concatenating the tether elements. Tether elements may be segments of tether constructs. Such polymers can include, but are not limited to: polyethylene glycols, polyglycols, polypyridines, polyisocyanides, polyisocyanates, poly(triarylmethyl)methacrylates, polyaldehydes, polypyrrolinones, polyureas, polyglycol phosphodiesters, polyacrylates, polymethacrylates, polyacrylamides, polyvinyl esters, polystyrenes, polyamides, polyurethanes, polycarbonates, polybutyrates, polybutadienes, polybutyrolactones, polypyrrolidinones, polyvinylphosphonates, polyacetamides, polysaccharides, polyhyaluranates, polyamides, polyimides, polyesters, polyethylenes, polypropylenes, polystyrenes, polycarbonates, polyterephthalates, polysilanes, polyurethanes, polyethers, polyamino acids, polyglycines, polyprolines, N-substituted polylysine, polypeptides, side-chain N-substituted peptides, poly-N-substituted glycine, peptoids, side-chain carboxyl-substituted peptides, homopeptides, oligonucleotides, ribonucleic acid oligonucleotides, deoxynucleic acid oligonucleotides, oligonucleotides modified to prevent Watson-Crick base pairing, oligonucleotide analogs, polycytidylic acid, polyadenylic acid, polyuridylic acid, polythymidine, polyphosphate, polynucleotides, polyribonucleotides, polyethylene glycol-phosphodiesters, peptide polynucleotide analogues, threosyl-polynucleotide analogues, glycol-polynucleotide analogues, morpholino-polynucleotide analogues, locked nucleotide oligomer analogues, polypeptide analogues, branched polymers, comb polymers, star polymers, dendritic polymers, random, gradient and block copolymers, anionic polymers, cationic polymers, polymers forming stem-loops, rigid segments and flexible segments.

A variety of additional terms are defined or otherwise characterized herein.

DETAILED DESCRIPTION

One aspect of the invention is generally directed to compositions comprising a recombinant polymerase, e.g., a recombinant DPO4-type DNA polymerase that includes one or more mutations as compared to a reference polymerase, e.g., a wildtype DPO4-type polymerase. Depending on the particular mutation or combination of mutations, the polymerase exhibits one or more properties that find use, e.g., in single molecule sequencing applications. Exemplary properties exhibited by various polymerases of the invention include the ability to incorporate “bulky” nucleotide analogs into a growing daughter strand during DNA replication. The polymerases can include one or more exogenous or heterologous features at the N- and/or C-terminal regions of the protein for use, e.g., in the purification of the recombinant polymerase. The polymerases can also include one or more deletions that facilitate purification of the protein, e.g., by increasing the solubility of recombinantly produced protein.

These new polymerases are particularly well suited to DNA replication and/or sequencing applications, particularly sequencing protocols that include incorporation of bulky nucleotide analogs into a replicated nucleic acid daughter strand, such as in the sequencing by expansion (SBX) protocol, as further described below.

Polymerases of the invention include, for example, a recombinant DPO4-type DNA polymerase that comprises a mutation at one or more positions selected from the group consisting of M76, K78, E79, Q82, Q83, and S86, wherein identification of positions is relative to wild-type DPO4 polymerase (SEQ ID NO:1). Optionally, the polymerase comprises mutations at two or more, three or more, four or more, five or more, six or more, up to ten or more, up to 20 or more, or from 20 to 30 or more of these positions. A number of exemplary substitutions at these (and other) positions are described herein.

DNA Polymerases

DNA polymerases that can be modified to increase the ability to incorporate bulky nucleotide analog substrates into a growing daughter nucleic acid strand and/or other desirable properties as described herein are generally available. DNA polymerases are sometimes classified into six main groups, or families, based upon various phylogenetic relationships, e.g., with E. coli Pol I (class A), E. coli Pol II (class B), E. coli Pol III (class C), Euryarchaeotic Pol II (class D), human Pol beta (class X), and E. coli UmuC/DinB and eukaryotic RAD30/xeroderma pigmentosum variant (class Y). For a review of recent nomenclature, see, e.g., Burgers et al. (2001) “Eukaryotic DNA polymerases: proposal for a revised nomenclature” J Biol. Chem. 276(47):43487-90. For a review of polymerases, see, e.g., Hubscher et al. (2002) “Eukaryotic DNA Polymerases” Annual Review of Biochemistry Vol. 71: 133-163; Alba (2001) “Protein Family Review: Replicative DNA Polymerases” Genome Biology 2(1): reviews 3002.1-3002.4; and Steitz (1999) “DNA polymerases: structural diversity and common mechanisms” J Biol Chem 274:17395-17398. DNA polymerases have been extensively studied and the basic mechanisms of action for many have been determined. In addition, the sequences of literally hundreds of polymerases are publicly available, and the crystal structures for many of these have been determined or can be inferred based upon similarity to solved crystal structures for homologous polymerases. For example, the crystal structure of DPO4, a preferred type of parental enzyme to be modified according to the present invention, is available; see, e.g., Ling et al. (2001) “Crystal Structure of a Y-Family DNA Polymerase in Action: A Mechanism for Error-Prone and Lesion-Bypass Replication” Cell 107:91-102.

DNA polymerases that are preferred substrates for mutation to increase the use of bulky nucleotide analogs as substrates for incorporation into growing nucleic acid daughter strands, and/or to alter one or more other properties described herein, include DPO4 polymerases and other members of the Y family of translesional DNA polymerases, such as Dbh, and derivatives of such polymerases.

In one aspect, the polymerase that is modified is a DPO4-type DNA polymerase. For example, the modified recombinant DNA polymerase can be homologous to a wildtype DPO4 DNA polymerase. Alternately, the modified recombinant DNA polymerase can be homologous to other Class Y DNA polymerases, also known as “translesion” DNA polymerases, such as Sulfolobus acidocaldarius Dbh polymerase. For a review, see Goodman and Woodgate (2013) “Translesion DNA Polymerases” Cold Spring Harb Perspect Biol doi:10.1101/cshperspect.a010363. See, e.g., SEQ ID NO:1 for the amino acid sequence of wildtype DPO4 polymerase.

Many polymerases that are suitable for modification, e.g., for use in sequencing technologies, are commercially available. For example, DPO4 polymerase is available from TREVEGAN® and New England Biolabs®.

In addition to wildtype polymerases, chimeric polymerases made from a mosaic of different sources can be used. For example, DPO4-type polymerases made by taking sequences from more than one parental polymerase into account can be used as a starting point for mutation to produce the polymerases of the invention. Chimeras can be produced, e.g., using consideration of similarity regions between the polymerases to define consensus sequences that are used in the chimera, or using gene shuffling technologies in which multiple DPO4-related polymerases are randomly or semi-randomly shuffled via available gene shuffling techniques (e.g., via “family gene shuffling”; see Crameri et al. (1998) “DNA shuffling of a family of genes from diverse species accelerates directed evolution” Nature 391:288-291; Clackson et al. (1991) “Making antibody fragments using phage display libraries” Nature 352:624-628; Gibbs et al. (2001) “Degenerate oligonucleotide gene shuffling (DOGS): a method for enhancing the frequency of recombination with family shuffling” Gene 271:13-20; and Hiraga and Arnold (2003) “General method for sequence-independent site-directed chimeragenesis” J. Mol. Biol. 330:287-296). In these methods, the recombination points can be predetermined such that the gene fragments assemble in the correct order. However, the combinations, e.g., chimeras, can be formed at random. Appropriate mutations to improve incorporation of bulky nucleotide analog substrates or another desirable property can be introduced into the chimeras.

Nucleotide Analogs

As discussed, various polymerases of the invention can incorporate one or more nucleotide analogs into a growing oligonucleotide chain. Upon incorporation, the analog can leave a residue that is the same as or different than a natural nucleotide in the growing oligonucleotide (the polymerase can incorporate any non-standard moiety of the analog, or can cleave it off during incorporation into the oligonucleotide). A “nucleotide analog” herein is a compound that, in a particular application, functions in a manner similar or analogous to a naturally occurring nucleoside triphosphate (a “nucleotide”), and does not otherwise denote any particular structure. A nucleotide analog is an analog other than a standard naturally occurring nucleotide, i.e., other than A, G, C, T, or U, though upon incorporation into the oligonucleotide, the resulting residue in the oligonucleotide can be the same as (or different from) an A, G, C, T, or U residue.

Many nucleotide analogs are available and can be incorporated by the polymerases of the invention. These include analog structures with core similarity to naturally occurring nucleotides, such as those that comprise one or more substituents on a phosphate, sugar, or base moiety of the nucleoside or nucleotide relative to a naturally occurring nucleoside or nucleotide.

In one useful aspect of the invention, nucleotide analogs can also be modified to achieve any of the improved properties desired. For example, various tethers, linkers, or other substituents can be incorporated into analogs to create a “bulky” nucleotide analog, wherein the term “bulky” is understood to mean that the size of the analog is substantially larger than a natural nucleotide, while not denoting any particular dimension. For example, the analog can include a substituted compound (i.e., an “XNTP”, as disclosed in U.S. Pat. No. 7,939,259 and PCT Publication No. WO 2016/081871 to Kokoris et al.) of the formula:

As shown in the above formula, the monomeric XNTP construct has a nucleobase residue, N, that has two moieties separated by a selectively cleavable bond (V), each moiety attaching to one end of a tether (T). The tether ends can attach to the linker group modifications on the heterocycle, the ribose group, or the phosphate backbone. The monomer substrate also has an intra-substrate cleavage site positioned within the phosphororibosyl backbone such that cleavage will provide expansion of the constrained tether. For example, to synthesize an XATP monomer, the amino linker on 8-[(6-Amino)hexyl]-amino-ATP or N6-(6-Amino)hexyl-ATP can be used as a first tether attachment point, and a mixed backbone linker, such as the non-bridging modification (N-1-aminoalkyl) phosphoramidate or (2-aminoethyl) phosphonate, can be used as a second tether attachment point. Further, a bridging backbone modification such as a phosphoramidate (3′ O-P-N 5′) or a phosphorothiolate (3′ O-P-S 5′), for example, can be used for selective chemical cleavage of the primary backbone. R¹ and R² are end groups configured as appropriate for the synthesis protocol in which the substrate construct is used. For example, R¹=5′-triphosphate and R²=3′-OH for a polymerase protocol. The R¹ 5′ triphosphate may include mixed backbone modifications, such as an aminoethyl phosphonate or 3′-O-P-S-5′ phosphorothiolate, to enable tether linkage and backbone cleavage, respectively. Optionally, R² can be configured with a reversible blocking group for cyclical single-substrate addition. Alternatively, R¹ and R² can be configured with linker end groups for chemical coupling. R¹ and R² can be of the general type XR, wherein X is a linking group and R is a functional group. Detailed atomic structures of suitable substrates for polymerase variants of the present invention may be found, e.g., in Vaghefi, M. (2005) “Nucleoside Triphosphates and their Analogs” CRC Press Taylor & Francis Group.

Applications for Increased Abilities to Incorporate Bulky Nucleotide Analog Substrates

Polymerases of the invention, e.g., modified recombinant polymerases, or variants, may be used in combination with nucleotides and/or nucleotide analogs and nucleic acid templates (DNA or RNA) to copy the template nucleic acid. That is, a mixture of the polymerase, nucleotides/analogs, and optionally other appropriate reagents, the template and a replication initiating moiety (e.g., primer) is reacted such that the polymerase synthesizes a daughter nucleic acid strand (e.g., extends the primer) in a template-dependent manner. The replication initiating moiety can be a standard oligonucleotide primer, or, alternatively, a component of the template, e.g., the template can be a self-priming single stranded DNA, a nicked double stranded DNA, or the like. Similarly, a terminal protein can serve as an initiating moiety. At least one nucleotide analog can be incorporated into the DNA. The template DNA can be a linear or circular DNA, and in certain applications, is desirably a circular template (e.g., for rolling circle replication or for sequencing of circular templates). Optionally, the composition can be present in an automated DNA replication and/or sequencing system.

In one embodiment, the daughter nucleic acid strand is an Xpandomer intermediate comprised of XNTPs, as disclosed in U.S. Pat. No. 7,939,259, and PCT Publication No. WO 2016/081871 to Kokoris et al. and assigned to Stratos Genomics, which are herein incorporated by reference in their entirety. Stratos Genomics has developed a method called Sequencing by Expansion (“SBX”) that uses a DNA polymerase to transcribe the sequence of DNA onto a measurable polymer called an “Xpandomer”. In general terms, an Xpandomer encodes (parses) the nucleotide sequence data of the target nucleic acid in a linearly expanded format, thereby improving spatial resolution, optionally with amplification of signal strength. The transcribed sequence is encoded along the Xpandomer backbone in high signal-to-noise reporters that are separated by ~10 nm and are designed for well-differentiated responses. These differences provide significant performance enhancements in sequence read efficiency and accuracy of Xpandomers relative to native DNA. Xpandomers can enable several next generation DNA sequencing technologies and are well suited to nanopore sequencing. As discussed above, one method of Xpandomer synthesis uses XNTPs as nucleic acid analogs to extend the template-dependent synthesis and uses a DNA polymerase variant as a catalyst.

Mutating Polymerases

Various types of mutagenesis are optionally used in the present invention, e.g., to modify polymerases to produce variants, e.g., in accordance with polymerase models and model predictions as discussed above, or using random or semi-random mutational approaches. In general, any available mutagenesis procedure can be used for making polymerase mutants. Such mutagenesis procedures optionally include selection of mutant nucleic acids and polypeptides for one or more activity of interest (e.g., the ability to incorporate bulky nucleotide analogs into a daughter nucleic acid strand). Procedures that can be used include, but are not limited to: site-directed point mutagenesis, random point mutagenesis, in vitro or in vivo homologous recombination (DNA shuffling and combinatorial overlap PCR), mutagenesis using uracil containing templates, oligonucleotide-directed mutagenesis, phosphorothioate-modified DNA mutagenesis, mutagenesis using gapped duplex DNA, point mismatch repair, mutagenesis using repair-deficient host strains, restriction-selection and restriction-purification, deletion mutagenesis, mutagenesis by total gene synthesis, degenerate PCR, double-strand break repair, and many others known to persons of skill. The starting polymerase for mutation can be any of those noted herein, including wildtype DPO4 polymerase.

Optionally, mutagenesis can be guided by known information (e.g., “rational” or “semi-rational” design) from a naturally occurring polymerase molecule, or of a known altered or mutated polymerase (e.g., using an existing mutant polymerase as noted in the preceding references), e.g., sequence, sequence comparisons, physical properties, crystal structure and/or the like as discussed above. However, in another class of embodiments, modification can be essentially random (e.g., as in classical or “family” DNA shuffling, see, e.g., Crameri et al. (1998) “DNA shuffling of a family of genes from diverse species accelerates directed evolution” Nature 391:288-291).

Additional information on mutation formats is found in: Sambrook et al., Molecular Cloning: A Laboratory Manual (3rd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., 2000 (“Sambrook”); Current Protocols in Molecular Biology, F. M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (supplemented through 2011) (“Ausubel”); and PCR Protocols: A Guide to Methods and Applications (Innis et al. eds) Academic Press Inc. San Diego, Calif. (1990) (“Innis”). The following publications and references cited within provide additional detail on mutation formats: Arnold, Protein engineering for unusual environments, Current Opinion in Biotechnology 4:450-455 (1993); Bass et al., Mutant Trp repressors with new DNA-binding specificities, Science 242:240-245 (1988); Bordo and Argos (1991) Suggestions for “Safe” Residue Substitutions in Site-directed Mutagenesis 217:721-729; Botstein & Shortle, Strategies and applications of in vitro mutagenesis, Science 229:1193-1201 (1985); Carter et al., Improved oligonucleotide site-directed mutagenesis using M13 vectors, Nucl. Acids Res. 13:4431-4443 (1985); Carter, Site-directed mutagenesis, Biochem. J. 237:1-7 (1986); Carter, Improved oligonucleotide-directed mutagenesis using M13 vectors, Methods in Enzymol. 154:382-403 (1987); Dale et al., Oligonucleotide-directed random mutagenesis using the phosphorothioate method, Methods Mol. Biol. 57:369-374 (1996); Eghtedarzadeh & Henikoff, Use of oligonucleotides to generate large deletions, Nucl. Acids Res. 14:5115 (1986); Fritz et al., Oligonucleotide-directed construction of mutations: a gapped duplex DNA procedure without enzymatic reactions in vitro, Nucl. Acids Res. 16:6987-6999 (1988); Grundstrom et al., Oligonucleotide-directed mutagenesis by microscale ‘shot-gun’ gene synthesis, Nucl. Acids Res. 13:3305-3316 (1985); Hayes (2002) Combining Computational and Experimental Screening for rapid Optimization of Protein Properties, PNAS 99(25):15926-15931; Kunkel, The efficiency of oligonucleotide directed mutagenesis, in Nucleic Acids & Molecular Biology (Eckstein, F. and Lilley, D. M. J. eds., Springer Verlag, Berlin) (1987); Kunkel, Rapid and efficient site-specific mutagenesis without phenotypic selection, Proc. Natl. Acad. Sci. USA 82:488-492 (1985); Kunkel et al., Rapid and efficient site-specific mutagenesis without phenotypic selection, Methods in Enzymol. 154:367-382 (1987); Kramer et al., The gapped duplex DNA approach to oligonucleotide-directed mutation construction, Nucl. Acids Res. 12:9441-9456 (1984); Kramer & Fritz, Oligonucleotide-directed construction of mutations via gapped duplex DNA, Methods in Enzymol. 154:350-367 (1987); Kramer et al., Point Mismatch Repair, Cell 38:879-887 (1984); Kramer et al., Improved enzymatic in vitro reactions in the gapped duplex DNA approach to oligonucleotide-directed construction of mutations, Nucl. Acids Res. 16:7207 (1988); Ling et al., Approaches to DNA mutagenesis: an overview, Anal Biochem. 254(2):157-178 (1997); Lorimer and Pastan, Nucleic Acids Res. 23:3067-8 (1995); Mandecki, Oligonucleotide-directed double-strand break repair in plasmids of Escherichia coli: a method for site-specific mutagenesis, Proc. Natl. Acad. Sci. USA 83:7177-7181 (1986); Nakamaye & Eckstein, Inhibition of restriction endonuclease Nci I cleavage by phosphorothioate groups and its application to oligonucleotide-directed mutagenesis, Nucl. Acids Res. 14:9679-9698 (1986); Nambiar et al., Total synthesis and cloning of a gene coding for the ribonuclease S protein, Science 223:1299-1301 (1984); Sakmar and Khorana, Total synthesis and expression of a gene for the α-subunit of bovine rod outer segment guanine nucleotide-binding protein (transducin), Nucl. Acids Res. 14:6361-6372 (1988); Sayers et al., 5′-3′ Exonucleases in phosphorothioate-based oligonucleotide-directed mutagenesis, Nucl. Acids Res. 16:791-802 (1988); Sayers et al., Strand specific cleavage of phosphorothioate-containing DNA by reaction with restriction endonucleases in the presence of ethidium bromide, Nucl. Acids Res. 16:803-814 (1988); Sieber et al., Nature Biotechnology 19:456-460 (2001); Smith, In vitro mutagenesis, Ann. Rev. Genet. 19:423-462 (1985); Methods in Enzymol. 100:468-500 (1983); Methods in Enzymol. 154:329-350 (1987); Stemmer, Nature 370:389-91 (1994); Taylor et al., The use of phosphorothioate-modified DNA in restriction enzyme reactions to prepare nicked DNA, Nucl. Acids Res. 13:8749-8764 (1985); Taylor et al., The rapid generation of oligonucleotide-directed mutations at high frequency using phosphorothioate-modified DNA, Nucl. Acids Res. 13:8765-8787 (1985); Wells et al., Importance of hydrogen-bond formation in stabilizing the transition state of subtilisin, Phil. Trans. R. Soc. Lond. A 317:415-423 (1986); Wells et al., Cassette mutagenesis: an efficient method for generation of multiple mutations at defined sites, Gene 34:315-323 (1985); Zoller & Smith, Oligonucleotide-directed mutagenesis using M13-derived vectors: an efficient and general procedure for the production of point mutations in any DNA fragment, Nucleic Acids Res. 10:6487-6500 (1982); Zoller & Smith, Oligonucleotide-directed mutagenesis of DNA fragments cloned into M13 vectors, Methods in Enzymol. 100:468-500 (1983); Zoller & Smith, Oligonucleotide-directed mutagenesis: a simple method using two oligonucleotide primers and a single-stranded DNA template, Methods in Enzymol. 154:329-350 (1987); Clackson et al. (1991) “Making antibody fragments using phage display libraries” Nature 352:624-628; Gibbs et al. (2001) “Degenerate oligonucleotide gene shuffling (DOGS): a method for enhancing the frequency of recombination with family shuffling” Gene 271:13-20; and Hiraga and Arnold (2003) “General method for sequence-independent site-directed chimeragenesis” J. Mol. Biol. 330:287-296. Additional details on many of the above methods can be found in Methods in Enzymology Volume 154, which also describes useful controls for trouble-shooting problems with various mutagenesis methods.

Screening Polymerases

Screening or other protocols can be used to determine whether a polymerase displays a modified activity, e.g., for a nucleotide analog, as compared to a parental DNA polymerase. For example, a polymerase can be assayed for the ability to bind and incorporate bulky nucleotide analogs into a daughter strand during template-dependent DNA synthesis. Assays for such properties, and the like, are described herein. Performance of a recombinant polymerase in a primer extension reaction can be examined to assay properties such as nucleotide analog incorporation, etc., as described herein.

In one desirable aspect, a library of recombinant DNA polymerases can be made and screened for these properties. For example, a plurality of members of the library can be made to include one or more mutations that alter incorporation and/or randomly generated mutations (e.g., where different members include different mutations or different combinations of mutations), and the library can then be screened for the properties of interest (e.g., incorporation, etc.). In general, the library can be screened to identify at least one member comprising a modified activity of interest.
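As an illustration of a “logical” library in which different members carry different mutations or combinations of mutations, the following is a minimal Python sketch. The function name enumerate_library, the toy eight-residue sequence, and the candidate positions and replacement residues are all hypothetical and chosen only for illustration; they are not DPO4 positions or mutations disclosed herein. The sketch assumes every listed replacement differs from the wildtype residue at that position.

```python
from itertools import combinations, product


def enumerate_library(wild_type, candidate_sites, max_mutations=2):
    """Yield (label, sequence) for each variant with 1..max_mutations substitutions.

    candidate_sites maps a 1-based position to the replacement residues to try
    there. Assumes each replacement differs from the wildtype residue.
    """
    positions = sorted(candidate_sites)
    for k in range(1, max_mutations + 1):
        # Choose which positions are mutated, then which residue goes at each.
        for chosen in combinations(positions, k):
            for residues in product(*(candidate_sites[p] for p in chosen)):
                seq = list(wild_type)
                labels = []
                for pos, new in zip(chosen, residues):
                    labels.append(f"{seq[pos - 1]}{pos}{new}")  # e.g. "K2A"
                    seq[pos - 1] = new
                yield "+".join(labels), "".join(seq)


# Toy parent sequence and hypothetical candidate substitutions.
toy_parent = "MKTAYIAK"
toy_sites = {2: "AR", 5: "FW"}
library = dict(enumerate_library(toy_parent, toy_sites))
```

Each entry of library pairs a mutation label with the corresponding variant sequence, so the members can be synthesized and screened for the property of interest. In practice the number of combinations grows quickly with the number of sites, which is one reason random or semi-random approaches are also used.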

Libraries of polymerases can be either physical or logical in nature. Moreover, any of a wide variety of library formats can be used. For example, polymerases can be fixed to solid surfaces in arrays of proteins. Similarly, liquid phase arrays of polymerases (e.g., in microwell plates) can be constructed for convenient high-throughput fluid manipulations of solutions comprising polymerases. Liquid, emulsion, or gel-phase libraries of cells that express recombinant polymerases can also be constructed, e.g., in microwell plates, or on agar plates. Phage display libraries of polymerases or polymerase domains (e.g., including the active site region or interdomain stability regions) can be produced. Likewise, yeast display libraries can be used. Instructions in making and using libraries can be found, e.g., in Sambrook, Ausubel and Berger, referenced herein.

For the generation of libraries involving fluid transfer to or from microtiter plates, a fluid handling station is optionally used. Several “off the shelf” fluid handling stations for performing such transfers are commercially available, including, e.g., the Zymate systems from Caliper Life Sciences (Hopkinton, Mass.) and other stations which utilize automatic pipettors, e.g., in conjunction with the robotics for plate movement (e.g., the ORCA® robot, which is used in a variety of laboratory systems available, e.g., from Beckman Coulter, Inc. (Fullerton, Calif.)).

In an alternate embodiment, fluid handling is performed in microchips, e.g., involving transfer of materials from microwell plates or other wells through microchannels on the chips to destination sites (microchannel regions, wells, chambers or the like). Commercially available microfluidic systems include those from Hewlett-Packard/Agilent Technologies (e.g., the HP2100 bioanalyzer) and the Caliper High Throughput Screening System. The Caliper High Throughput Screening System provides one example interface between standard microwell library formats and Labchip technologies. RainDance Technologies' nanodroplet platform provides another method for handling large numbers of spatially separated reactions. Furthermore, the patent and technical literature includes many examples of microfluidic systems which can interface directly with microwell plates for fluid handling.

Tags and Other Optional Polymerase Features

The recombinant DNA polymerase optionally includes additional features exogenous or heterologous to the polymerase. For example, the recombinant polymerase optionally includes one or more tags, e.g., purification, substrate binding, or other tags, such as a polyhistidine tag, a His10 tag, a His6 tag, an alanine tag, an Ala10 tag, an Ala16 tag, a biotin tag, a biotin ligase recognition sequence or other biotin attachment site (e.g., a BiTag or a Btag or variant thereof, e.g., BtagV1-11), a GST tag, an S Tag, a SNAP-tag, an HA tag, a DSB (Sso7D) tag, a lysine tag, a NanoTag, a Cmyc tag, a tag or linker comprising the amino acids glycine and serine, a tag or linker comprising the amino acids glycine, serine, alanine and histidine, a tag or linker comprising the amino acids glycine, arginine, lysine, glutamine and proline, a plurality of polyhistidine tags, a plurality of His10 tags, a plurality of His6 tags, a plurality of alanine tags, a plurality of Ala10 tags, a plurality of Ala16 tags, a plurality of biotin tags, a plurality of GST tags, a plurality of BiTags, a plurality of S Tags, a plurality of SNAP-tags, a plurality of HA tags, a plurality of DSB (Sso7D) tags, a plurality of lysine tags, a plurality of NanoTags, a plurality of Cmyc tags, a plurality of tags or linkers comprising the amino acids glycine and serine, a plurality of tags or linkers comprising the amino acids glycine, serine, alanine and histidine, a plurality of tags or linkers comprising the amino acids glycine, arginine, lysine, glutamine and proline, biotin, avidin, an antibody or antibody domain, antibody fragment, antigen, receptor, receptor domain, receptor fragment, or ligand, one or more protease sites (e.g., Factor Xa, enterokinase, or thrombin site), a dye, an acceptor, a quencher, a DNA binding domain (e.g., a helix-hairpin-helix domain from topoisomerase V), or combination thereof. The one or more exogenous or heterologous features at the N- and/or C-terminal regions of the polymerase can find use not only for purification purposes, immobilization of the polymerase to a substrate, and the like, but can also be useful for altering one or more properties of the polymerase.

The one or more exogenous or heterologous features can be included internal to the polymerase, at the N-terminal region of the polymerase, at the C-terminal region of the polymerase, or both the N-terminal and C-terminal regions of the polymerase. Where the polymerase includes an exogenous or heterologous feature at both the N-terminal and C-terminal regions, the exogenous or heterologous features can be the same (e.g., a polyhistidine tag, e.g., a His10 tag, at both the N- and C-terminal regions) or different (e.g., a biotin ligase recognition sequence at the N-terminal region and a polyhistidine tag, e.g., His10 tag, at the C-terminal region). Optionally, a terminal region (e.g., the N- or C-terminal region) of a polymerase of the invention can comprise two or more exogenous or heterologous features which can be the same or different (e.g., a biotin ligase recognition sequence and a polyhistidine tag at the N-terminal region, a biotin ligase recognition sequence, a polyhistidine tag, and a Factor Xa recognition site at the N-terminal region, and the like). As a few examples, the polymerase can include a polyhistidine tag at the C-terminal region, a biotin ligase recognition sequence and a polyhistidine tag at the N-terminal region, a biotin ligase recognition sequence and a polyhistidine tag at the N-terminal region and a polyhistidine tag at the C-terminal region, or a polyhistidine tag and a biotin ligase recognition sequence at the C-terminal region.

Making and Isolating Recombinant Polymerases

Generally, nucleic acids encoding a polymerase of the invention can be made by cloning, recombination, in vitro synthesis, in vitro amplification and/or other available methods. A variety of recombinant methods can be used for expressing an expression vector that encodes a polymerase of the invention. Methods for making recombinant nucleic acids, expression and isolation of expressed products are well known and described in the art. A number of exemplary mutations and combinations of mutations, as well as strategies for design of desirable mutations, are described herein. Methods for making and selecting mutations in the active site of polymerases, including for modifying steric features in or near the active site to permit improved access by nucleotide analogs, are found hereinabove and, e.g., in PCT Publication Nos. WO 2007/076057 and WO 2008/051530.

Additional useful references for mutation, recombinant and in vitro nucleic acid manipulation methods (including cloning, expression, PCR, and the like) include Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology volume 152 Academic Press, Inc., San Diego, Calif. (Berger); Kaufman et al. (2003) Handbook of Molecular and Cellular Methods in Biology and Medicine Second Edition Ceske (ed) CRC Press (Kaufman); and The Nucleic Acid Protocols Handbook Ralph Rapley (ed) (2000) Cold Spring Harbor, Humana Press Inc (Rapley); Chen et al. (ed) PCR Cloning Protocols, Second Edition (Methods in Molecular Biology, volume 192) Humana Press; and in Viljoen et al. (2005) Molecular Diagnostic PCR Handbook Springer, ISBN 1402034032.

In addition, a plethora of kits are commercially available for the purification of plasmids or other relevant nucleic acids from cells (see, e.g., EasyPrep™ and FlexiPrep™, both from Pharmacia Biotech; StrataClean™, from Stratagene; and QIAprep™ from Qiagen). Any isolated and/or purified nucleic acid can be further manipulated to produce other nucleic acids, used to transfect cells, incorporated into related vectors to infect organisms for expression, and/or the like. Typical cloning vectors contain transcription and translation terminators, transcription and translation initiation sequences, and promoters useful for regulation of the expression of the particular target nucleic acid. The vectors optionally comprise generic expression cassettes containing at least one independent terminator sequence, sequences permitting replication of the cassette in eukaryotes, or prokaryotes, or both (e.g., shuttle vectors), and selection markers for both prokaryotic and eukaryotic systems. Vectors are suitable for replication and integration in prokaryotes, eukaryotes, or both.

Other useful references, e.g., for cell isolation and culture (e.g., for subsequent nucleic acid isolation) include Freshney (1994) Culture of Animal Cells, a Manual of Basic Technique, third edition, Wiley-Liss, New York and the references cited therein; Payne et al. (1992) Plant Cell and Tissue Culture in Liquid Systems John Wiley & Sons, Inc. New York, N.Y.; Gamborg and Phillips (eds) (1995) Plant Cell, Tissue and Organ Culture; Fundamental Methods Springer Lab Manual, Springer-Verlag (Berlin Heidelberg New York) and Atlas and Parks (eds) The Handbook of Microbiological Media (1993) CRC Press, Boca Raton, Fla.

Nucleic acids encoding the recombinant polymerases of the invention are also a feature of the invention. A particular amino acid can be encoded by multiple codons, and certain translation systems (e.g., prokaryotic or eukaryotic cells) often exhibit codon bias, e.g., different organisms often prefer one of the several synonymous codons that encode the same amino acid. As such, nucleic acids of the invention are optionally “codon optimized,” meaning that the nucleic acids are synthesized to include codons that are preferred by the particular translation system being employed to express the polymerase. For example, when it is desirable to express the polymerase in a bacterial cell (or even a particular strain of bacteria), the nucleic acid can be synthesized to include codons most frequently found in the genome of that bacterial cell, for efficient expression of the polymerase. A similar strategy can be employed when it is desirable to express the polymerase in a eukaryotic cell, e.g., the nucleic acid can include codons preferred by that eukaryotic cell.
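The codon optimization idea described above can be sketched, under simplifying assumptions, as a lookup from each amino acid to the single codon preferred by the chosen expression host. The table PREFERRED_CODON below is a hypothetical, deliberately partial example covering only five symbols; a real table would be derived from codon usage data for the actual host and would cover all twenty amino acids. The function name codon_optimize is likewise an assumption made for illustration.

```python
# Hypothetical, partial preferred-codon table for an assumed bacterial host.
# Values are illustrative only and are not measured codon usage data.
PREFERRED_CODON = {
    "M": "ATG",
    "K": "AAA",
    "L": "CTG",
    "G": "GGC",
    "*": "TAA",  # stop
}


def codon_optimize(protein, table=PREFERRED_CODON):
    """Back-translate a protein by always choosing the host-preferred codon."""
    missing = sorted(set(protein) - set(table))
    if missing:
        raise KeyError(f"no preferred codon defined for: {missing}")
    return "".join(table[aa] for aa in protein)


optimized = codon_optimize("MKLG*")
```

Always picking the single most frequent codon is the simplest possible strategy; other schemes (for example, sampling codons in proportion to host usage) are possible and are not shown here.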

A variety of protein isolation and detection methods are well known in the art and can be used to isolate polymerases, e.g., from recombinant cultures of cells expressing the recombinant polymerases of the invention. Such methods include, e.g., those set forth in R. Scopes, Protein Purification, Springer-Verlag, N.Y. (1982); Deutscher, Methods in Enzymology Vol. 182: Guide to Protein Purification, Academic Press, Inc. N.Y. (1990); Sandana (1997) Bioseparation of Proteins, Academic Press, Inc.; Bollag et al. (1996) Protein Methods, 2nd Edition Wiley-Liss, NY; Walker (1996) The Protein Protocols Handbook Humana Press, NJ; Harris and Angal (1990) Protein Purification Applications: A Practical Approach IRL Press at Oxford, Oxford, England; Harris and Angal Protein Purification Methods: A Practical Approach IRL Press at Oxford, Oxford, England; Scopes (1993) Protein Purification: Principles and Practice 3rd Edition Springer Verlag, NY; Janson and Ryden (1998) Protein Purification: Principles, High Resolution Methods and Applications, Second Edition Wiley-VCH, NY; and Walker (1998) Protein Protocols on CD-ROM Humana Press, NJ; and the references cited therein. Additional details regarding protein purification and detection methods can be found in Satinder Ahuja ed., Handbook of Bioseparations, Academic Press (2000).

Nucleic Acid and Polypeptide Sequences and Variants

As described herein, the invention also features polynucleotide sequences encoding, e.g., a polymerase as described herein. Examples of polymerase sequences that include features found herein, e.g., as in Table 2, are provided. However, one of skill in the art will immediately appreciate that the invention is not limited to the specifically exemplified sequences. For example, one of skill will appreciate that the invention also provides, e.g., many related sequences with the functions described herein, e.g., polynucleotides and polypeptides encoding conservative variants of a polymerase of Tables 2 and 3 or any other specifically listed polymerase herein. Combinations of any of the mutations noted herein are also features of the invention.

Accordingly, the invention provides a variety of polypeptides (polymerases) and polynucleotides (nucleic acids that encode polymerases). Exemplary polynucleotides of the invention include, e.g., any polynucleotide that encodes a polymerase of Table 2 or otherwise described herein. Because of the degeneracy of the genetic code, many polynucleotides equivalently encode a given polymerase sequence. Similarly, an artificial or recombinant nucleic acid that hybridizes to a polynucleotide indicated above under highly stringent conditions over substantially the entire length of the nucleic acid (and is other than a naturally occurring polynucleotide) is a polynucleotide of the invention. In one embodiment, a composition includes a polypeptide of the invention and an excipient (e.g., buffer, water, pharmaceutically acceptable excipient, etc.). The invention also provides an antibody or antisera specifically immunoreactive with a polypeptide of the invention (e.g., that specifically recognizes a feature of the polymerase that confers decreased branching or increased complex stability).

In certain embodiments, a vector (e.g., a plasmid, a cosmid, a phage, a virus, etc.) comprises a polynucleotide of the invention. In one embodiment, the vector is an expression vector. In another embodiment, the expression vector includes a promoter operably linked to one or more of the polynucleotides of the invention. In another embodiment, a cell comprises a vector that includes a polynucleotide of the invention.

One of skill will also appreciate that many variants of the disclosed sequences are included in the invention. For example, conservative variations of the disclosed sequences that yield a functionally similar sequence are included in the invention. Variants of the polynucleotide sequences, wherein the variants hybridize to at least one disclosed sequence, are considered to be included in the invention. Unique subsequences of the sequences disclosed herein, as determined by, e.g., standard sequence comparison techniques, are also included in the invention.

Conservative Variations

Owing to the degeneracy of the genetic code, “silent substitutions” (i.e., substitutions in a nucleic acid sequence which do not result in an alteration in an encoded polypeptide) are an implied feature of every nucleic acid sequence that encodes an amino acid sequence. Similarly, “conservative amino acid substitutions,” where one or a limited number of amino acids in an amino acid sequence are substituted with different amino acids with highly similar properties, are also readily identified as being highly similar to a disclosed construct. Such conservative variations of each disclosed sequence are a feature of the present invention.
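To make the notion of a silent substitution concrete, the following minimal Python sketch builds the standard genetic code, translates two coding sequences, and reports whether they differ at the nucleotide level while encoding the same polypeptide. It also counts how many distinct coding sequences encode a given peptide. The function names and the short test sequences are hypothetical and serve only as an illustration; the sketch assumes in-frame DNA whose length is a multiple of three.

```python
from itertools import product
from math import prod

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(dna):
    """Translate in-frame DNA (length a multiple of 3) to one-letter residues."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent(dna_a, dna_b):
    """True if the sequences differ but encode the same amino acid sequence."""
    return dna_a != dna_b and translate(dna_a) == translate(dna_b)


def count_encodings(protein):
    """Number of distinct coding sequences that encode the given protein."""
    synonyms = {}
    for aa in CODON_TABLE.values():
        synonyms[aa] = synonyms.get(aa, 0) + 1
    return prod(synonyms[aa] for aa in protein)


silent = is_silent("ATGAAACTG", "ATGAAGCTG")   # AAA and AAG both encode lysine
n_encodings = count_encodings("MKLG")           # 1 * 2 * 6 * 4
```

Even for a four-residue peptide there are dozens of equivalent coding sequences, which is why silent variants are treated as an implied feature of every disclosed nucleic acid sequence.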

“Conservative variations” of a particular nucleic acid sequence refer to those nucleic acids which encode identical or essentially identical amino acid sequences, or, where the nucleic acid does not encode an amino acid sequence, to essentially identical sequences. One of skill will recognize that individual substitutions, deletions or additions which alter, add or delete a single amino acid or a small percentage of amino acids (typically less than 5%, more typically less than 4%, 2% or 1%) in an encoded sequence are “conservatively modified variations” where the alterations result in the deletion of an amino acid, addition of an amino acid, or substitution of an amino acid with a chemically similar amino acid, while retaining the relevant mutational feature (for example, the conservative substitution can be of a residue distal to the active site region, or distal to an interdomain stability region). Thus, “conservative variations” of a listed polypeptide sequence of the present invention include substitutions of a small percentage, typically less than 5%, more typically less than 2% or 1%, of the amino acids of the polypeptide sequence, with an amino acid of the same conservative substitution group. Finally, the addition of sequences which do not alter the encoded activity of a nucleic acid molecule, such as the addition of a non-functional or tagging sequence (introns in the nucleic acid, poly His or similar sequences in the encoded polypeptide, etc.), is a conservative variation of the basic nucleic acid or polypeptide.

Conservative substitution tables providing functionally similar amino acids are well known in the art, where one amino acid residue is substituted for another amino acid residue having similar chemical properties (e.g., aromatic side chains or positively charged side chains), and therefore does not substantially change the functional properties of the polypeptide molecule. The following sets forth example groups that contain natural amino acids of like chemical properties, where a substitution within a group is a “conservative substitution”.

TABLE 1 Conservative Amino Acid Substitutions

Nonpolar and/or   Polar, uncharged   Aromatic        Positively charged   Negatively charged
aliphatic side    side chains        side chains     side chains          side chains
chains
Glycine           Serine             Phenylalanine   Lysine               Aspartate
Alanine           Threonine          Tyrosine        Arginine             Glutamate
Valine            Cysteine           Tryptophan      Histidine
Leucine           Methionine
Isoleucine        Asparagine
Proline           Glutamine
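The grouping in Table 1 can be expressed as a simple lookup. The following is a minimal sketch, not part of the disclosed methods; the group labels and the function names `group_of` and `is_conservative` are illustrative assumptions, and only the membership of each group is taken from Table 1.

```python
# Groups transcribed from Table 1 using one-letter amino acid codes.
# The dictionary keys are illustrative labels, not terms from the specification.
CONSERVATIVE_GROUPS = {
    "nonpolar_aliphatic": set("GAVLIP"),
    "polar_uncharged": set("STCMNQ"),
    "aromatic": set("FYW"),
    "positively_charged": set("KRH"),
    "negatively_charged": set("DE"),
}


def group_of(residue):
    """Return the Table 1 group label for a one-letter amino acid code."""
    code = residue.upper()
    for name, members in CONSERVATIVE_GROUPS.items():
        if code in members:
            return name
    raise ValueError(f"residue {residue!r} is not listed in Table 1")


def is_conservative(original, replacement):
    """True when two different residues fall in the same Table 1 group."""
    if original.upper() == replacement.upper():
        return False  # an unchanged residue is not a substitution
    return group_of(original) == group_of(replacement)
```

Under this sketch, a lysine-to-arginine change is reported as conservative, while methionine-to-valine is not, because Table 1 places methionine with the polar, uncharged residues.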

Nucleic Acid Hybridization

Comparative hybridization can be used to identify nucleic acids of the invention, including conservative variations of nucleic acids of the invention. In addition, target nucleic acids which hybridize to a nucleic acid of the invention under high, ultra-high and ultra-ultra-high stringency conditions, where the nucleic acids encode mutants corresponding to those noted in Tables 2 and 3 or other listed polymerases, are a feature of the invention. Examples of such nucleic acids include those with one or a few silent or conservative nucleic acid substitutions as compared to a given nucleic acid sequence encoding a polymerase of Table 2 (or other exemplified polymerase), where any conservative substitutions are for residues other than those noted in Table 2 or elsewhere as being relevant to a feature of interest (improved nucleotide analog incorporations, etc.).

A test nucleic acid is said to specifically hybridize to a probe nucleic acid when it hybridizes at least 50% as well to the probe as to the perfectly matched complementary target, i.e., with a signal to noise ratio at least half as high as hybridization of the probe to the target under conditions in which the perfectly matched probe binds to the perfectly matched complementary target with a signal to noise ratio that is at least about 5×-10× as high as that observed for hybridization to any of the unmatched target nucleic acids.
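The two-part criterion above can be written as a short check. This is a sketch only; the function name, the parameter names, and the choice of 5 as the default fold threshold (the lower end of the stated 5×-10× range) are assumptions made for illustration.

```python
def specifically_hybridizes(test_sn, matched_sn, unmatched_sn,
                            min_fold=5.0, min_fraction=0.5):
    """Apply the specific-hybridization criterion to signal-to-noise ratios.

    test_sn      -- signal to noise ratio of the test nucleic acid with the probe
    matched_sn   -- ratio for the perfectly matched complementary target
    unmatched_sn -- highest ratio observed for any unmatched target
    min_fold     -- assumed threshold; the text gives "at least about 5x-10x"
    min_fraction -- the test must hybridize at least this fraction as well
    """
    # The assay conditions must separate matched from unmatched targets.
    conditions_ok = matched_sn >= min_fold * unmatched_sn
    # The test nucleic acid must reach at least half the matched signal.
    test_ok = test_sn >= min_fraction * matched_sn
    return conditions_ok and test_ok
```

Passing `min_fold=10.0` applies the same check with the stricter 10× separation between matched and unmatched targets.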

Nucleic acids “hybridize” when they associate, typically in solution. Nucleic acids hybridize due to a variety of well characterized physico-chemical forces, such as hydrogen bonding, solvent exclusion, base stacking and the like. An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology—Hybridization with Nucleic Acid Probes part I chapter 2, “Overview of principles of hybridization and the strategy of nucleic acid probe assays,” (Elsevier, N.Y.), as well as in Current Protocols in Molecular Biology, Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (supplemented through 2011). Hames and Higgins (1995) Gene Probes 1, IRL Press at Oxford University Press, Oxford, England (Hames and Higgins 1) and Hames and Higgins (1995) Gene Probes 2, IRL Press at Oxford University Press, Oxford, England (Hames and Higgins 2) provide details on the synthesis, labeling, detection and quantification of DNA and RNA, including oligonucleotides.

An example of stringent hybridization conditions for hybridization of complementary nucleic acids which have more than 100 complementary residues on a filter in a Southern or northern blot is 50% formamide with 1 mg of heparin at 42° C. with the hybridization being carried out overnight. An example of stringent wash conditions is a 0.2×SSC wash at 65° C. for 15 minutes (see, Sambrook, supra for a description of SSC buffer). Often the high stringency wash is preceded by a low stringency wash to remove background probe signal. An example low stringency wash is 2×SSC at 40° C. for 15 minutes. In general, a signal to noise ratio of 5× (or higher) than that observed for an unrelated probe in the particular hybridization assay indicates detection of a specific hybridization.

“Stringent hybridization wash conditions” in the context of nucleic acid hybridization experiments such as Southern and northern hybridizations are sequence dependent, and are different under different environmental parameters. An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993), supra, and in Hames and Higgins, 1 and 2. Stringent hybridization and wash conditions can easily be determined empirically for any test nucleic acid. For example, in determining stringent hybridization and wash conditions, the hybridization and wash conditions are gradually increased (e.g., by increasing temperature, decreasing salt concentration, increasing detergent concentration and/or increasing the concentration of organic solvents such as formamide in the hybridization or wash), until a selected set of criteria are met. For example, in highly stringent hybridization and wash conditions, the hybridization and wash conditions are gradually increased until a probe binds to a perfectly matched complementary target with a signal to noise ratio that is at least 5× as high as that observed for hybridization of the probe to an unmatched target.

“Very stringent” conditions are selected to be equal to the thermal melting point (T_m) for a particular probe. The T_m is the temperature (under defined ionic strength and pH) at which 50% of the test sequence hybridizes to a perfectly matched probe. For the purposes of the present invention, generally, “highly stringent” hybridization and wash conditions are selected to be about 5° C. lower than the T_m for the specific sequence at a defined ionic strength and pH.

“Ultra high-stringency” hybridization and wash conditions are those in which the stringency of hybridization and wash conditions is increased until the signal to noise ratio for binding of the probe to the perfectly matched complementary target nucleic acid is at least 10× as high as that observed for hybridization to any of the unmatched target nucleic acids. A target nucleic acid which hybridizes to a probe under such conditions, with a signal to noise ratio of at least ½ that of the perfectly matched complementary target nucleic acid, is said to bind to the probe under ultra-high stringency conditions.

Similarly, even higher levels of stringency can be determined by gradually increasing the hybridization and/or wash conditions of the relevant hybridization assay. Examples include conditions in which the stringency of hybridization and wash conditions is increased until the signal to noise ratio for binding of the probe to the perfectly matched complementary target nucleic acid is at least 10×, 20×, 50×, 100×, or 500× or more as high as that observed for hybridization to any of the unmatched target nucleic acids. A target nucleic acid which hybridizes to a probe under such conditions, with a signal to noise ratio of at least ½ that of the perfectly matched complementary target nucleic acid, is said to bind to the probe under ultra-ultra-high stringency conditions.

Nucleic acids that do not hybridize to each other under stringent conditions are still substantially identical if the polypeptides which they encode are substantially identical. This occurs, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code.

Sequence Comparison, Identity, and Homology

The terms “identical” or “percent identity,” in the context of two or more nucleic acid or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence, as measured using one of the sequence comparison algorithms described below (or other algorithms available to persons of skill) or by visual inspection.

The phrase “substantially identical,” in the context of two nucleic acids or polypeptides (e.g., DNAs encoding a polymerase, or the amino acid sequence of a polymerase) refers to two or more sequences or subsequences that have at least about 60%, about 80%, about 90-95%, about 98%, about 99% or more nucleotide or amino acid residue identity, when compared and aligned for maximum correspondence, as measured using a sequence comparison algorithm or by visual inspection. Such “substantially identical” sequences are typically considered to be “homologous,” without reference to actual ancestry. Preferably, the “substantial identity” exists over a region of the sequences that is at least about 50 residues in length, more preferably over a region of at least about 100 residues, and most preferably, the sequences are substantially identical over at least about 150 residues, or over the full length of the two sequences to be compared.

Proteins and/or protein sequences are “homologous” when they are derived, naturally or artificially, from a common ancestral protein or protein sequence. Similarly, nucleic acids and/or nucleic acid sequences are homologous when they are derived, naturally or artificially, from a common ancestral nucleic acid or nucleic acid sequence. Homology is generally inferred from sequence similarity between two or more nucleic acids or proteins (or sequences thereof). The precise percentage of similarity between sequences that is useful in establishing homology varies with the nucleic acid and protein at issue, but as little as 25% sequence similarity over 50, 100, 150 or more residues is routinely used to establish homology. Higher levels of sequence similarity, e.g., 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 99% or more identity, can also be used to establish homology. Methods for determining sequence similarity percentages (e.g., BLASTP and BLASTN using default parameters) are described herein and are generally available.

For sequence comparison and homology determination, typically one sequence acts as a reference sequence to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.
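Once a reference and a test sequence have been aligned, the percent identity calculation itself is simple arithmetic. The sketch below assumes the two sequences are already aligned to equal length with "-" marking gaps, and it counts identical positions over all aligned columns; other denominators (for example, the length of the shorter sequence) are also in common use, so this convention and the function name `percent_identity` are assumptions rather than the definition used by any particular program.

```python
def percent_identity(aligned_reference, aligned_test, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Columns in which both sequences carry a gap are ignored; a column with a
    gap in only one sequence counts as a non-identical column.
    """
    if len(aligned_reference) != len(aligned_test):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (r, t)
        for r, t in zip(aligned_reference, aligned_test)
        if not (r == gap and t == gap)
    ]
    if not columns:
        return 0.0
    matches = sum(1 for r, t in columns if r == t and r != gap)
    return 100.0 * matches / len(columns)
```

For instance, two aligned four-residue sequences that differ at a single position give 75% identity under this convention.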

Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by visual inspection (see generally Current Protocols in Molecular Biology, Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., supplemented through 2011).

One example of an algorithm that is suitable for determining percent sequence identity and sequence similarity is the BLAST algorithm, which is described in Altschul et al., J. Mol. Biol. 215:403-410 (1990). Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information. This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a word length (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=−4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a word length (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915).
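The extension step described above, with its three stopping rules, can be illustrated for nucleotide sequences as follows. This is a simplified, ungapped, one-direction sketch and not the BLAST implementation; the function name `extend_right`, its parameters, and the default value of `x_drop` are assumptions chosen for illustration. Only the match reward (M=5) and mismatch penalty (N=−4) are taken from the BLASTN defaults quoted in the text. Extension to the left would mirror the same logic.

```python
def extend_right(query, subject, q_end, s_end, seed_score,
                 reward=5, penalty=-4, x_drop=20):
    """Extend a word hit to the right without gaps.

    q_end, s_end -- index of the first position after the seed in each sequence
    seed_score   -- cumulative score already earned by the seed word
    x_drop       -- illustrative value of X, the allowed fall from the maximum

    Returns (number of positions added to the seed, best cumulative score).
    """
    score = best_score = seed_score
    best_extension = 0
    offset = 0
    # Rule 3: stop when the end of either sequence is reached.
    while q_end + offset < len(query) and s_end + offset < len(subject):
        same = query[q_end + offset] == subject[s_end + offset]
        score += reward if same else penalty
        offset += 1
        if score > best_score:
            best_score, best_extension = score, offset
        # Rule 1: the score fell by X from its maximum achieved value.
        # Rule 2: the cumulative score dropped to zero or below.
        if best_score - score >= x_drop or score <= 0:
            break
    return best_extension, best_score
```

A larger `x_drop` lets the extension tolerate longer runs of mismatches before giving up, which trades speed for sensitivity in the way the text attributes to the parameter X.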

In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul (1993) Proc. Nat'l. Acad. Sci. USA 90:5873-5877). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001.

For reference, the amino acid sequence of a wild-type DPO4 polymerase is presented in Table 2.

Exemplary Mutation Combinations

A list of exemplary polymerase mutation combinations and the amino acid sequences of recombinant DPO4 polymerases harboring the exemplary mutation combinations are provided in Tables 2 and 3. Positions of amino acid substitutions are identified relative to a wild-type DPO4 DNA polymerase (SEQ ID NO:1). Polymerases of the invention (including those provided in Tables 2 and 3) can include any exogenous or heterologous feature (or combination of such features) at the N- and/or C-terminal region. For example, it will be understood that polymerase mutants in Tables 2 and 3 that do not include, e.g., a C-terminal polyhistidine tag can be modified to include a polyhistidine tag at the C-terminal region, alone or in combination with any of the exogenous or heterologous features described herein. Any of the variants set forth herein may also include a deletion of the last 12 amino acids of the protein (i.e., amino acids 341-352) so as to, e.g., increase protein solubility in bacterial expression systems.
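The mutation labels used in Tables 2 and 3 (for example, M76V_K78Q_E79M_Q82N_Q83R_S86L) read as wild-type residue, 1-based position in SEQ ID NO:1, and replacement residue, joined by underscores. The sketch below shows one way such a label could be applied to a sequence and how the optional C-terminal deletion could be expressed. The function names are hypothetical, and `WT_PREFIX` holds only the first 88 residues of SEQ ID NO:1 as printed in Table 2, which is enough to cover positions 76 to 86.

```python
import re

# First 88 residues of SEQ ID NO:1 as printed in Table 2 (illustrative excerpt).
WT_PREFIX = (
    "MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA"
    "TANYEARKFGVKAGIPIVEAKKILPNAVYLPMRKEVYQQVSSRI"
)

_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutations(wild_type, label):
    """Apply a label such as 'M76V_K78Q' to a wild-type sequence."""
    residues = list(wild_type)
    for token in label.split("_"):
        match = _MUTATION.match(token)
        if match is None:
            raise ValueError(f"cannot parse mutation {token!r}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(wild_type):
            raise ValueError(f"position {position} is outside the sequence")
        # Positions are 1-based and checked against the wild-type residue.
        if wild_type[position - 1] != old:
            raise ValueError(
                f"expected {old} at position {position}, "
                f"found {wild_type[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)


def truncate_c_terminus(sequence, n=12):
    """Remove the last n residues (12 corresponds to amino acids 341-352)."""
    return sequence[:-n] if n > 0 else sequence
```

Applying the SGM-0001.6 label to `WT_PREFIX` changes residues 75 to 88 from PMRKEVYQQVSSRI to PVRQMVYNRVSLRI, which agrees with the corresponding stretch of SEQ ID NO:2 in Table 2. Labels such as Q83Q, which appear in some table entries, leave the residue unchanged.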

TABLE 2 DPO4 Variants Identified through Random Mutagenesis SEQ ID NOAmino Acid Sequence  1 MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVAwt DPO4 DNA polymerase TANYEARKFGVKAGIPIVEAKKILPNAVYLPMRKEVYQQVSSRIMNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKILEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  2MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0001.6TANYEARKFGVKAGIPIVEAKKILPNAVYLPVRQMVYNRVSLRI M76V_K78Q_E79M_Q82N_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83R_S86LLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  3MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0009.2TANYEARKFGVKAGIPIVEAKKILPNAVYLPVRTWVYNSVSERI M76V_K78T_E79W_Q82N_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83S_S86ELEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  4MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0010.02TANYEARKFGVKAGIPIVEAKKILPNAVYLPARLVVYSRVSWRI M76A_K78L_E79V_Q82S_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83R_S86WLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  5MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0010.08TANYEARKFGVKAGIPIVEAKKILPNAVYLPSRLNVYHSVSKRI M76S_K78L_E79N_Q82H_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI 
Q83S_S86KLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  6MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0010.17TANYEARKFGVKAGIPIVEAKKILPNAVYLPSRLNVYHSVSNRI M76S_K78L_E79N_Q82H_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83S_S86NLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  7MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0010.22TANYEARKFGVKAGIPIVEAKKILPNAVYLPARLYVYDTVSKRI M76A_K78L_E79Y_Q82D_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83T_S86KLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  8MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0010.45TANYEARKFGVKAGIPIVEAKKILPNAVYLPARVNVYWSVSSRI M76A_K78V_E79N_Q82W_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83SLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  9MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0010.46TANYEARKFGVKAGIPIVEAKKILPNAVYLPLRSVVYEIVSQRI M76L_K78S_E79V_Q82E_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83I_S86QLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 10MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0010.62TANYEARKFGVKAGIPIVEAKKILPNAVYLPVRSGVYGEVSKRI M76V_K78S_E79G_Q82G_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI 
Q83E_S86KLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 11MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0010.65TANYEARKFGVKAGIPIVEAKKILPNAVYLPVRSSVYNMVSVRI M76V_K78S_E79S_Q82N_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83M_S86VLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 12MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0010.72TANYEARKFGVKAGIPIVEAKKILPNAVYLPARFNVYSSVSMRI M76A_K78F_E79N_Q82S_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83S_S86MLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 13MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0010.101TANYEARKFGVKAGIPIVEAKKILPNAVYLPVRELVYMQVSERI M76V_K78E_E79L_Q82M_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI S86ELEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 14MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0010.105TANYEARKFGVKAGIPIVEAKKILPNAVYLPTRSHVYRDVSTRI M76T_K78S_E79H_Q82R_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83D_S86TLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 15MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0010.115TANYEARKFGVKAGIPIVEAKKILPNAVYLPCRMLVYWEVSQRI M76C_K78M_E79L_Q82W_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI 
Q83E_S86QLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 16MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0010.153TANYEARKFGVKAGIPIVEAKKILPNAVYLPARVSVYSAVSTRI M76A_K78V_E79S_Q82S_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83A_S86TLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 17MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0010.176TANYEARKFGVKAGIPIVEAKKILPNAVYLPSRTVVYDKVSGRI M76S_K78T_E79V_Q82D_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83K_S86GLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 18MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0023.29TANYEARKFGVKAGIPIVEAKKILPNAVYLPSRFAVYNAVSRRI M76S_K78F_E79A_Q82N_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83A_S86RLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 19MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0023.61TANYEARKFGVKAGIPIVEAKKILPNAVYLPARASVYKHVSLRI M76A_K78A_E79S_Q82K_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83H_S86LLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 20MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0023.75TANYEARKFGVKAGIPIVEAKKILPNAVYLPMRFAVYGDVSARI K78F_E79A_Q82G_Q83D_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI 
S86ALEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 21MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0025.47TANYEARKFGVKAGIPIVEAKKILPNAVYLPARIRVYVAVSERI M76A_K78I_E79R_Q82V_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83A_S86ELEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 22MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0027.26TANYEARKFGVKAGIPIVEAKKILPNAVYLPSRHSVYSMVSTRI M76S_K78H_E79S_Q82S_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83M_S86TLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 23MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0027.35TANYEARKFGVKAGIPIVEAKKILPNAVYLPLRYTVYEAVSMRI M76L_K78Y_E79T_Q82E_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83A_S86MLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 24MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0027.38TANYEARKFGVKAGIPIVEAKKILPNAVYLPLRYSVYWSVSERI M76L_K78Y_E79S_Q82W_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83S_S86ELEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 25MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0027.45TANYEARKFGVKAGIPIVEAKKILPNAVYLPFRPVVYDRVSERI M76F_K78P_E79V_Q82D_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI 
Q83R_S86ELEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 26MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0027.64TANYEARKFGVKAGIPIVEAKKILPNAVYLPVRQLVYEAVSGRI M76V_K78Q_E79L_Q82E_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83A_S86GLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 27MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0029.25TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRIRVYEQVSMRI M76W_K78I_E79R_Q82E_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83Q_S86MLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 28MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0029.45TANYEARKFGVKAGIPIVEAKKILPNAVYLPVRFPVYEGVSGRI M76V_K78F_E79P_Q82E_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83G_S86GLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 29MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0029.87TANYEARKFGVKAGIPIVEAKKILPNAVYLPARGLVYWQVSSRI M76A_K78G_E79L_Q82W_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83Q_S86SLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 30MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0031.16TANYEARKFGVKAGIPIVEAKKILPNAVYLPARIDVYDSVSNRI M76A_K78I_E79D_Q82D_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI 
Q83S_S86NLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 31MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-31.16TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRIDVYDSVSNRI M76W_K78I_E79D_Q82D_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83S_S86NLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 32MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0031.33TANYEARKFGVKAGIPIVEAKKILPNAVYLPARIDVYDSVSKRI M76A_K78I_E79D_Q82D_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83S_S86KLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 33MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-31.33TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRIDVYDSVSKRI M76W_K78I_E79D_Q82D_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83S_S86KLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 34MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0031.76TANYEARKFGVKAGIPIVEAKKILPNAVYLPSRTLVYYMVSERI M76S_K78T_E79L_Q82Y_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83M_S86ELEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 35MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0033.35TANYEARKFGVKAGIPIVEAKKILPNAVYLPSRSAVYEKVSGRI M76S_K78S_E79A_Q82E_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI 
Q83K_S86GLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 36MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0033.61TANYEARKFGVKAGIPIVEAKKILPNAVYLPHRPLVYYGVSERI M76H_K78P_E79L_Q82Y_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83G_S86ELEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 37MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0034.67TANYEARKFGVKAGIPIVEAKKILPNAVYLPSRTFVYEKVSWRI M76S_K78T_E79F_Q82E_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83K_S86WLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 38MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0035.78TANYEARKFGVKAGIPIVEAKKILPNAVYLPARILVYSGVSARI M76A_K78I_E79L_Q82S_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83G_S86ALEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 39MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0036.69TANYEARKFGVKAGIPIVEAKKILPNAVYLPARTEVYYQVSKRI M76A_K78T_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82Y_Q83Q_S86KLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 40MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0037.07TANYEARKFGVKAGIPIVEAKKILPNAVYLPARLPVYTTVSTRI M76A_K78L_E79P_Q82T_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI 
Q83T_S86TLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 41MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0037.53TANYEARKFGVKAGIPIVEAKKILPNAVYLPSRNLVYWSVSDRI M76S_K78N_E79L_Q82W_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83S_S86DLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 42MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0037.65TANYEARKFGVKAGIPIVEAKKILPNAVYLPARLLVYDHVSMRI M76A_K78L_E79L_Q82D_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83H_S86MLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 43MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0038.06TANYEARKFGVKAGIPIVEAKKILPNAVYLPQRFSVYDEVSGRI M76Q_K78F_E79S_Q82D_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83E_S86GLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 44MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-0057.37TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRPLVYYGVSERI M76W_K78P_E79L_Q82Y_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83G_S86ELEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 45MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-MOTHRATANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWSVSDRI M76W_K78N_E79L_Q82W_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI 
Q83S_S86DLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 46MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA SGM-71.85TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRPLVYWSVSDRI M76W_K78P_E79L_Q82W_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83S_S86DLEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT

TABLE 3 DPO4 Variants Identified through Semi-Rational Design SEQ ID NOAmino Acid Sequence   1 MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVAwt DPO4 DNA polymerase TANYEARKFGVKAGIPIVEAKKILPNAVYLPMRKEVYQQVSSRIMNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKILEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIRELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT   47MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGVVA PDC47TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_T141S_LEKEKITVSVGISKNKVLAKFAVDMAKPNGIKVIDDEEVKRLIR F150L_I153F_A155V_I217VELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMVGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVTEDLDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  48MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGVVA PDC48TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_T141S_LEKEKITVSVGISKNKVLAGFAVYMAKPNGIKVIDDEEVKRLIR F150L_K152G_I153F_A155V_ELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMVGEA D156Y_I217V_I226F_KAKYLFSLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY V289W_T290K_E291S_LFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISK D292Y_L293W_D326EETAYSESVKLLQKILEEEERKIRRIGVRFSKFIEAIGLDKFFDT  49MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGVVA PDC49TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_T141S_LEKEKITVSVGISKNKVLAKFAVWMAKPNGIKVIDDEEVKRLIR F150L_I153F_A155V_ELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMVGEA D156W_I217V_I226F_KAKYLFSLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY V289W_T290K_E291S_LFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISK D292Y_L293W_D326EETAYSESVKLLQKILEEEERKIRRIGVRFSKFIEAIGLDKFFDT  50MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGVVA PDC50TANYEARKFGVKAGIPIREAKKILPNAVYLPWRNLVYWGVSERI 
A42V_V62R_M76W_K78N_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI E79L_Q82W_Q83G_S86E_LEKEKITVSVGISKNKVLAKFAVYMAKPNGIKVIDDEEVKRLIR T141S_F150L_I153F_A155V_ELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMVGEA D156Y_I217V_I226F_V289W_KAKYLFSLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY T290K_E291S_D292Y LFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISK L293W_D326EETAYSESVKLLQKILEEEERKIRRIGVRFSKFIEAIGLDKFFDT  51MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGVVA PDC51TANYEARKFGVKAGIPIREAKKILPNAVYLPWRNLVYWGVSERI A42V_V62R_M76W_K78N_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI E79L_Q82W_Q83G_S86E_LEKEKITVSVGISKNKVLAGFAVWMAKPNGIKVIDDEEVKRLIR T141S_F150L_K152G_I153F_ELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMVGEA A155V_D156W_I217V_I226F_KAKYLFSLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY V289W_T290K_E291S_LFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISK D292Y_L293W_D326EETAYSESVKLLQKILEEEERKIRRIGVRFSKFIEAIGLDKFFDT  52MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA PDC52TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI M76W_K78N_E79L_Q82W_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83G_S86E_V289W_T290K_LEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIR E291S_D292Y_L293WELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  53MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGAVA PDC53TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI M76W_K78N_E79L_Q82W_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83G_S86E_T290K_E291S_LEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIR D292Y_L293WELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAVKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  54MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGAVA PDC54TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI M76W_K78N_E79L_Q82W_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83G_S86E_V289W_E291S_LEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIR 
D292Y_L293WELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAWTSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  55MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGAVA PDC55TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI M76W_K78N_E79L_Q82W_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83G_S86E_V289W_T290K_LEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIR D292Y_L293WELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAWKEYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  56MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGAVA PDC56TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI M76W_K78N_E79L_Q82W_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83G_S86E_V289W_T290K_LEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIR E291S_L293WELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAWKSDWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  57MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGAVA PDC57TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI M76W_K78N_E79L_Q82W_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83G_S86E_V289W_T290K_LEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIR L293WELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAWKEDWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  58MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGAVA PDC58TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI M76W_K78N_E79L_Q82W_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83G_S86E_V289W_E291S_LEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIR L293WELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAWTSDWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  59MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGAVA PDC59TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI 
M76W_K78N_E79L_Q82W_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83G_S86E_V289W_D292Y_LEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIR L293WELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAWTEYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  60MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC60TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_T141S_LEKEKITVSVGISKNKVLAKFAVYMAKPNGIKVIDDEEVKRLIR F150L_I153F_A155V_D156Y_ELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMVGEA I217V_I226F_V289W_KAKYLFSLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY T290K_E291S_D292Y_PLFRAIEESYYKLDKRIKAIHVVAWKSYWDIVSRGRTFPHGISK L293WETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  61MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGAVA PDC61TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI M76W_K78N_E79L_Q82W_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83G_S86E_T141S_F150L_LEKEKITVSVGISKNKVLAKFAVYMAKPNGIKVIDDEEVKRLIR I153F_A155V_D156Y_ELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMVGEA I217V_I226F_V289W_T290K_KAKYLFSLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY E291S_D292Y_L293WLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  62MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGAVA PDC62TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI M76W_K78N_E79L_Q82W_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q83G_S86E_K152G_D156W_LEKEKITVTVGISKNKVFAGIAAWMAKPNGIKVIDDEEVKRLIR V289W_T290K_E291S_ELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEA D292Y_L293WKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  63MIVLYVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC63TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI F5Y_A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_LEKEKITVSVGISKNKVFAGFAAWMAKPNGIKVIDDEEVKRLIR T141S_K152G_I153F_ELDIADVPGIGNLTAEKLKKLGINKLVDTLSIEFDKLKGMVGEA 
D156W_I189L_I217V_I226F_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY V289W_T290K_E291S LFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISK D292Y_L293WETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  64MIVLYVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC64TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI F5Y_A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_LEKEKITVSVGISKNKVLAGFAAWMAKPNGIKVIDDEEVKRLIR T141S_F150L_K152G_I153F_ELDIADVPGIGNLTAEKLKKLGINKLVDTLSIEFDKLKGMVGEA D156W_I189L_I217V_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY I226F_V289W_T290K_LFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISK E291S_D292Y_L293WETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  65MIVLYVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC65TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI F5Y_A42V_M76W_K78N_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI E79L_Q82W_Q83G_S86E_LEKEKITVSVGISKNKVFAGIAAWMAKPNGIKVIDDEEVKRLIR T141S_K152G_D156W_ELDIADVPGIGNLTAEKLKKLGINKLVDTLSIEFDKLKGMVGEA I189L_I217V_I226F_V289W_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY T290K_E291S_D292Y_LFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISK L293WETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  66MIVLYVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC66TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI F5Y_A42V_M76W_K78N_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI E79L_Q82W_Q83G_S86E_LEKEKITVSVGISKNKVLAGIAAWMAKPNGIKVIDDEEVKRLIR T141S_F150L_K152G_ELDIADVPGIGNLTAEKLKKLGINKLVDTLSIEFDKLKGMVGEA D156W_I189L_I217V_I226F_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY V289W_T290K_E291S_LFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISK D292Y_L293WETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  67MIVLYVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC67TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI F5Y_A42V_M76W_K78N_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI E79L_Q82W_Q83G_S86E_LEKEKITVSVGISKNKVFAKFAAWMAKPNGIKVIDDEEVKRLIR T141S_I153F_D156W_I189L_ELDIADVPGIGNLTAEKLKKLGINKLVDTLSIEFDKLKGMVGEA I217V_I226F_V289W_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY 
T290K_E291S_D292Y_LFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISK L293WETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  68MIVLYVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC68TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI F5Y_A42V_M76W_K78N_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI E79L_Q82W_Q83G_S86E_LEKEKITVSVGISKNKVLAKFAAWMAKPNGIKVIDDEEVKRLIR T141S_F150L_I153F_D156W ELDIADVPGIGNLTAEKLKKLGINKLVDTLSIEFDKLKGMVGEA I189L_I217V_I226F_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY V289W_I290K_E291S_LFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISK D292Y_L293WETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  69MIVLYVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC69TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI F5Y_A42V_M76W_K78N_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI E79L_Q82W_Q83G_S86E_LEKEKITVSVGISKNKVFAKIAAWMAKPNGIKVIDDEEVKRLIR T141S_D156W_I189L_I217V_ELDIADVPGIGNLTAEKLKKLGINKLVDTLSIEFDKLKGMVGEA I226F_V289W_T290K_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY E291S_D292Y_L293WLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  70MIVLYVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC70TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI F5Y_A42V_M76W_K78N_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI E79L_Q82W_Q83G_S86E_LEKEKITVSVGISKNKVLAKIAAWMAKPNGIKVIDDEEVKRLIR T141S_F150L_D156W_I189L_ELDIADVPGIGNLTAEKLKKLGINKLVDTLSIEFDKLKGMVGEA I217V_I226F_V289W_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY T290K_E291S_D292Y_LFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISK L293WETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  71MIVLYVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC71TANYEARKFGVKAGIPIREAKKRLPNAVYLPWRNLVYWGVSETD F5Y_A42V_V62R_I67R_WNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI M76W_K78N_E79L_Q82W_LEKEKITVSVGISKNKVFAGIAAWMAKPNGIKVIDDEEVKRLIR Q83G_S86E_R87T_I88D_ELDIADVLGIGDGTAEKLKKLGINKLVDTLSIEFDKLKGMVGEA M89W_T141S_K152G_D156W_KAKYLISLARDEYNEPIRTRVIKSIGRIVTMKRNSRNLEEIKPY P184L_N188D_I189G_LFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISK 
I217V_R242I_V289W_ETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT T290K_E291S_D292Y_ L293W 72 MIVLYVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC72TANYEARKFGVKAGIPIREAKKLLPNAVYLPWRNLVYWGVSETD F5Y_A42V_V62R_I67L_M76W_WNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI K78N_E79L_Q82W_LEKEKITVSVGISKNKVFAGIAAWMAKPNGIKVIDDEEVKRLIR Q83G_S86E_R87T_I88D_ELDIADVLGIGDGTAEKLKKLGINKLVDTLSIEFDKLKGMVGEA M89W_T141S_K152G_D156W_KAKYLISLARDEYNEPIRTRVIKSIGRIVTMKRNSRNLEEIKPY P184L_N188D_I189G_LFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISK I217V_R242I_V289W_ETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT T290K_E291S_D292Y_ L293W 73 MIVLYVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC73TANYEARKFGVKAGIPIREAKKRLPNAVYLPWRNLVYWGVSETI F5Y_A42V_V62R_I67R_WNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI M76W_K78N_E79L_Q82W_LEKEKITVSVGISKNKVFAGIAAWMAKPNGIKVIDDEEVKRLIR Q83G_S86E_R87T_M89W_ELDIADVLGIGGETAEKLKKLGINKLVDTLSIEFDKLKGMVGEA T141S_K152G_D156W_KAKYLISLARDEYNEPIRTRVIKSIGRIVTMKRNSRNLEEIKPY P184L_N188G_I189E_I217V_LFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISK R242I_V289W_T290K_ETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT E291S_D292Y_L293W  74MIVLYVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC74TANYEARKFGVKAGIPIREAKKLLPNAVYLPWRNLVYWGVSETI F5Y_A42V_V62R_I67L_WNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI M76W_K78N_E79L_Q82W_LEKEKITVSVGISKNKVFAGIAAWMAKPNGIKVIDDEEVKRLIR Q83G_S86E_R87T_M89W_ELDIADVLGIGGETAEKLKKLGINKLVDTLSIEFDKLKGMVGEA T141S_K152G_D156W_KAKYLISLARDEYNEPIRTRVIKSIGRIVTMKRNSRNLEEIKPY P184L_N188G_I189E_I217V_LFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISK R242I_V289W_T290K_ETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT E291S_D292Y_L293W  75MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC75TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_K152G_LEKEKITVTVGISKNKVFAGIAAWMAKPNGIKVIDDEEVKRLIR D156W_V289W_T290K_ELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEA 
E291S_D292Y_L293WKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  76MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC77TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_V289W_LEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIR T290K_E291S_D292Y_ELDIADVPGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEA L293WKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  77MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC78TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_K152G_LEKEKITVTVGISKNKVFAGIAAWMAKPNGIKVIDDEEVKRLIR D156W_P184L_V289W_ELDIADVLGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEA T290K_E291S_D292Y_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY L293WLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  78MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC79TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_K152G_LEKEKITVTVGISKNKVFAGIAAWMAKPNGIKVIDDEEVKRLIR D156W_P184L_I189W_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA V289W_T290K_E291S_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY D292Y_L293WLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  79MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC80TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_K152G_LEKEKITVTVGISKNKVFAGIAAWMAKPNGIKVIDDEEVKRLIR D156W_I189W_V289W_ELDIADVPGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA T290K_E291S_D292Y_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY L293WLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  80MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA 
PDC81TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_K152G_LEKEKITVTVGISKNKVFAGIAAWMAKPNGIKVIDDEEVKRLIR D156W_N188L_V289W_ELDIADVPGIGLITAEKLKKLGINKLVDTLSIEFDKLKGMIGEA T290K_E291S_D292Y_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY L293WLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  81MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC82TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_P184L_LEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIR V289W_T290K_E291S_ELDIADVLGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEA D292Y_L293WKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  82MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC83TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_P184L_LEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIR I189W_V289W_T290K_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA E291S_D292Y_L293WKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  83MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC84TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_I189W_LEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIR V289W_T290K_E291S_ELDIADVPGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA D292Y_L293WKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  84MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC85TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_P184L_LEKEKITVTVGISKNKVFAKIAADMAKPNGIKVIDDEEVKRLIR 
N188L_V289W_T290K_ELDIADVLGIGLITAEKLKKLGINKLVDTLSIEFDKLKGMIGEA E291S_D292Y_L293WKAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  85MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC86TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_A155L_LEKEKITVTVGISKNKVFAKIALDMAKPNGIKVIDDEEVKRLIR P184L_I189W_V289W_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA T290K_E291S_D292Y_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY L293WLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  86MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC87TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_A155M_LEKEKITVTVGISKNKVFAKIAMDMAKPNGIKVIDDEEVKRLIR P184L_I189W_V289W_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA T290K_E291S_D292Y_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY L293WLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  87MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC88TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_K152G_LEKEKITVTVGISKNKVFAGIALDMAKPNGIKVIDDEEVKRLIR A155L_P184L_I189W_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA V289W_T290K_E291S_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY D292Y_L293WLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  88MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC89TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_K152G_LEKEKITVTVGISKNKVFAGIAMDMAKPNGIKVIDDEEVKRLIR A155M_P184L_I189W_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA V289W_T290K_E291S_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY 
D292Y_L293WLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  89MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC90TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_K152G_LEKEKITVTVGISKNKVFAGIAADMAKPNGIKVIDDEEVKRLIR P184L_I189W_V289W_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA T290K_E291S_D292Y_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY L293WLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  90MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC91TANYEARKFGVHAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_K56H_M76W_K78N_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI E79L_Q82W_Q83G_S86E_LEKEKITVTVGISKNKVFAGIAAWMAKPNGIKVIDDEEVKRLIR K152G_D156W_P184L_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA I189W_V289W_T290K_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY E291S_D292Y_L293W_LFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGETFPHGISK R300EETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  91MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC92TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_K152G_LEKEKITVTVGISKNKVFAGIAAWMAKPNGIKVIDDEEVKRLIR D156W_P184L_I189W_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA V289W_T290K_E291S_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY D292Y_L293W_R300VLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGVTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  92MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC93TANYEARKFGVYAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_K56Y_M76W_K78N_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI E79L_Q82W_Q83G_S86E_LEKEKITVTVGISKNKVFAGIAAWMAKPNGIKVIDDEEVKRLIR K152G_D156W_P184L_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA I189W_V289W_T290K_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY E291S_D292Y_L293WLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  93MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA 
PDC94TANYEARKFGVHAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_K56H_M76W_K78N_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI E79L_Q82W_Q83G_S86E_LEKEKITVTVGISKNKVFAGIAAWMAKPNGIKVIDDEEVKRLIR K152G_D156W_P184L_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA I189W_V289W_T290K_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY E291S_D292Y_L293WLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  94MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC95TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_K152G_LEKEKITVTVGISKNKVFAGQAAWMAKPNGIKVIDDEEVKRLIR I153Q_D156W_P184L_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA I189W_V289W_T290K_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY E291S_D292Y_L293WLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  95MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC96TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_K152G_LEKEKITVTVGISKNKVFAGWAAWMAKPNGIKVIDDEEVKRLIR I153W_D156W_P184L_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA I189W_V289W_T290K_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY E291S_D292Y_L293WLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  96MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC97TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_K152Q_LEKEKITVTVGISKNKVFAQIAAWMAKPNGIKVIDDEEVKRLIR D156W_P184L_I189W_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA V289W_T290K_E291S_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY D292Y_L293WLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  97MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC99TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI 
Q82W_Q83G_S86E_I153W_LEKEKITVTVGISKNKVFAKWAADMAKPNGIKVIDDEEVKRLIR P184L_I189W_V289W_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA T290K_E291S_D292Y_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY L293WLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  98MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC100TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_I153Q_LEKEKITVTVGISKNKVFAKQAADMAKPNGIKVIDDEEVKRLIR P184L_I189W_V289W_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA T290K_E291S_D292Y_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY L293WLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT  99MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC101TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_K152Q_LEKEKITVTVGISKNKVFAQIAADMAKPNGIKVIDDEEVKRLIR P184L_I189W_V289W_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA T290K_E291S_D292Y_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY L293WLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 100MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC102TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_K152G_LEKEKITVTVGISKNKVFAGIALWMAKPNGIKVIDDEEVKRLIR A155L_D156W_P184L_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA I189W_V289W_T290K_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY E291S_D292Y_L293WLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 101MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC103TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_K152G_LEKEKITVTVGISKNKVFAGIAMWMAKPNGIKVIDDEEVKRLIR A155M_D156W_P184L_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA 
I189W_V289W_T290K_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY E291S_D292Y_L293WLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 102MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC104TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_T141S_LEKEKITVSVGISKNKVFAKIAMWMAKPNGIKVIDDEEVKRLIR A155M_D156W_P184L_ELDIADVLGIGNITAEKLKKLGINKLVDTLSIEFDKLKGMVGEA V289W_T290K_E291S_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY D292Y_L293WLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 103MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC105TANYEARKFGVHAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_K56H_M76W_K78N_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI E79L_Q82W_Q83G_S86E_LEKEKITVTVGISKNKVFAGIAAWMAKPNGIKVIDDEEVKRLIR K152G_D156W_P184L_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA I189W_V289W_T290K_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY E291S_D292Y_L293W_LFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGVTFPHGISK R300VETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 104MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC106TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_K152G_LEKEKITVTVGISKNKVFAGIAAWMAKPNGIKVIDDEEVKRLIR D156W_P184L_G187W_ELDIADVLGIWNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA I189W_V289W_T290K_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY E291S_D292Y_L293WLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 105MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC107TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_K152G_LEKEKITVTVGISKNKVFAGIAAWMAKPNGIKVIDDEEVKRLIR D156W_P184L_G187W_ELDIADVLGIWNITAEKLKKLGINKLVDTLSIEFDKLKGMIGEA V289W_T290K_E2915_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY 
D292Y_L293WLFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFIEAIGLDKFFDT 106MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC108TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_K152G_LEKEKITVTVGISKNKVFAGIAAWMAKPNGIKVIDDEEVKRLIR D156W_P184L_I189W_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA V289W_T290K_E291S_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY D292Y_L293WΔ341-352LFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKF 107MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC109TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_K152G_LEKEKITVTVGISKNKVFAGIAAWMAKPNGIKVIDDEEVKRLIR D156W_P184L_I189W_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA V289W_T290K_E291S_D292Y_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY L293WΔ342-352LFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISKETAYSESVKLLQKILEEDERKIRRIGVRFSKFI 108MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA PDC110TANYEARKFGVKAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_M76W_K78N_E79L_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI Q82W_Q83G_S86E_K152G_LEKEKITVTVGISKNKVFAGIAAWMAKPNGIKVIDDEEVKRLIR D156W_P184L_I189W_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA V289W_T290K_E291S_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY D292Y_L293W_LFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISK I341CΔ343-352ETAYSESVKLLQKILEEDERKIRRIGVRFSKFC 109MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA CO345TANYEARKFGVYAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_K56Y_M76W_K78N_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI E79L_Q82W_Q83G_S86E_LEKEKITVTVGISKNKVFAGIAAWMAKPNGIKVIDDEEVKRLIR K152G_D156W_P184L_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA I189W_V289W_T290K_KAKYLISLARDEYNEPIRTRVRKSIGRIVTMKRNSRNLEEIKPY E291S_D292Y_LFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISK L293WΔ341-352ETAYSESVKLLQKILEEDERKIRRIGVRFSKF 110MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA 
CO416TANYEARKFGVYAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_K56Y_M76W_K78N_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI E79L_Q82W_Q83G_S86E_LEKEKITVTVGISKNKVFAGIAMWMAKPNGIKVIDDEEVKRLIR K152G_A155M_D156W_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA P184L_I189W_I248T_V289W_KAKYLISLARDEYNEPIRTRVRKSIGRTVIMKRNSRNLEEIKPY T290K_E291S_D292Y_LFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISK L293WΔ341-352ETAYSESVKLLQKILEEDERKIRRIGVRFSKF 111MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA CO681TANYEARKFGVYAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_K56Y_M76W_K78N_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI E79L_Q82W_Q83G_S86E_LEKEKITVTVGISKNKVFAGIAMWMAKPNGIKVIDDEEVKRLIR K152G_A155M_D156W_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA P184L_I189W_I248T_KAKYLISLARDEYNEPIRTRVRKSIGRTVIMKRNSRNLEEIKPY V289W_T290R_E291S_LFRAIEESYYKLDKRIPKAIHVVAWRSYWDYVHRLRRFPHGISK D292Y_L293W_I295Y_S297H_ETAYSESVKLLQKILEEDERKIRRIGVRFSKF G299L_T301RΔ341-352 112MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA CO935TANYEARKFGVYPGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_K56Y_A57P_M76W_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI K78N_E79L_Q82W_Q83G_LEKEKITVTVGISKNKVFAGIAMWMAKPNGIKVIDDEEVKRLIR S86E_K152G_A155M_ELDIADVLGIGNWTAEKLKKLGINKLVDTLSIEFDKLKGMIGEA D156W_P184L_1189W_KAKYLISLARDEYNEPIRTRVRKSIGRTVIMKRNSRNLEEIKPY I248T_V289W_T290R_LFRAIEESYYKLDKRIPKAIHVVAWRSYWDYVHRLRRFPHGISK E291S_D292Y_L293W_ETAYSESVKLLQKILEEDERKIRRIGVRFSKF I295Y_S297H_G299L_T301R Δ341-352 113MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA CO534TANYEARKFGVYAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_K56Y_A57P_M76W_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI K78N_E79L_Q82W_Q83G_LEKEKITVTVGISKNKVFAGIAMWMAKPNGIKVIDDEEVKRLIR S86E_K152G_A155M_ELDIADVLGIPYWYAEKLKKLGINKLVDTLSIEFDKLKGMIGEA D156W_P184L_G187P_KAKYLISLARDEYNEPIRTRVRKSIGRTVIMKRNSRNLEEIKPY N188Y_I189W_T190Y_I248T_LFRAIEESYYKLDKRIPKAIHVVAWKSYWDIVSRGRTFPHGISK V289W_T290K_E291S_ETAYSESVKLLQKILEEDERKIRRIGVRFSKF D292Y_L293WΔ341-352 114MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA 
C1050TANYEARKFGVYAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_K56Y_A57P_M76W_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI K78N_E79L_Q82W_Q83G_LEKEKITVTVGISKNKVFAGIAMWMAKPNGIKVIDDEEVKRLIR S86E_K152G_A155M_ELDIADVLGIPYWYAEKLKKLGINKLVDTLSIEFDKLKGMIGEA D156W_P184L_G187P_KAKYLISLARDEYNEPIRTRVRKSIGRTVIMKRNSRNLEEIKPY N188Y_I189W_T190Y_I248T_LFRAIEESYYKLDKRIPKAIHVVAWRSYWDYVHRLRRFPHGISK V289W_T290R_E291S_ETAYSESVKLLQKILEEDERKIRRIGVRFSKF D292Y_L293W_I295Y_ S297H_G299L_T301RΔ341-352 115 MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGREEDSGVVA C1051TANYEARKFGVYAGIPIVEAKKILPNAVYLPWRNLVYWGVSERI A42V_K56Y_A57P_M76W_MNLLREYSEKIEIASIDEAYLDISDKVRDYREAYNLGLEIKNKI K78N_E79L_Q82W_Q83G_LEKEKITVTVGISKNKVFALRAGLMAKPNGIKVIDDEEVKRLIR S86E_K152L_I153R_A155G_ELDIADVLGIPYWYAEKLKKLGINKLVDTLSIEFDKLKGMIGEA D156L_P184L_G187P_KAKYLISLARDEYNEPIRTRVRKSIGRTVIMKRNSRNLEEIKPY N188Y_I189W_T190Y_I248T_LFRAIEESYYKLDKRIPKAIHVVAWRSYWDYVHRLRRFPHGISK V289W_T290R_E291S_ETAYSESVKLLQKILEEDERKIRRIGVRFSKF D292Y_L293W_I295Y_ S297H_G299L_T301RΔ341-352
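By way of illustration only, the substitution labels used in the tables above (e.g., M76W_K78N_E79L) can be read as wild-type residue, position in SEQ ID NO:1, and replacement residue, joined by underscores. The following Python sketch applies such a label to a wild-type sequence and checks each stated wild-type residue before substituting. The function name, the constant name, and the use of only the first 88 residues of SEQ ID NO:1 are assumptions made for brevity and are not part of the disclosure; deletion tokens such as Δ341-352 are not handled.

```python
import re

# First 88 residues of wild-type DPO4 (SEQ ID NO:1), copied from Table 3.
# Using only this N-terminal fragment is an assumption made to keep the
# sketch short; labels that reference later positions will be rejected.
WT_DPO4_N88 = (
    "MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGAVA"
    "TANYEARKFGVKAGIPIVEAKKILPNAVYLPMRKEVYQQVSSRI"
)

# One substitution token: wild-type residue, 1-based position, new residue.
_SUBSTITUTION = re.compile(r"([A-Z])(\d+)([A-Z])")


def apply_substitutions(wild_type, label):
    """Return wild_type with every substitution in label applied.

    label is an underscore-separated string such as "M76W_K78N".
    Hypothetical helper for illustration; raises ValueError when a token
    is malformed, out of range, or names the wrong wild-type residue.
    """
    residues = list(wild_type)
    for token in label.split("_"):
        match = _SUBSTITUTION.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognized token: {token!r}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position out of range in {token!r}")
        if residues[position - 1] != old:
            found = residues[position - 1]
            raise ValueError(f"{token!r} expects {old} but sequence has {found}")
        residues[position - 1] = new
    return "".join(residues)


if __name__ == "__main__":
    variant = apply_substitutions(WT_DPO4_N88, "M76W_K78N_E79L_Q82W_Q83S_S86D")
    # Print residues 45-88, the second 44-residue block of the table layout.
    print(variant[44:])
```

Applied to the label listed for SEQ ID NO:45, the sketch yields a second 44-residue block that can be compared against the corresponding block in the table. The wild-type residue check is also a simple way to flag labels that disagree with SEQ ID NO:1.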

The Examples and polymerase variants provided below further illustrate and exemplify the compositions of the present invention and methods of preparing and using such compositions. It is to be understood that the scope of the present invention is not limited in any way by the scope of the following Examples.

EXAMPLES

Example 1 Identification of DPO4 as a Candidate Translesion DNA Polymerase for Incorporation of Bulky Nucleotide Analogs During Template-Mediated DNA Synthesis

To identify a DNA polymerase with the ability to synthesize daughter strands using “bulky” substrates (i.e., able to bind and incorporate heavily substituted nucleotide analogs into a growing nucleic acid strand), a screen was conducted of several commercially available polymerases. Candidate polymerases were assessed for the ability to extend an oligonucleotide-bound primer using a pool of dNTP analogs substituted with alkyne linkers on both the backbone α-phosphate and the nucleobase moieties (model bulky substrates, referred to herein as “dNTP-2c”). Polymerases screened for activity included the following: Vent_(R) (Exo-), Deep Vent_(R)® (Exo-), Therminator, Therminator II, Therminator III, Therminator Y, PWO, PWO SuperYield, PyroPhage 3173 (Exo-), Bst, Large Fragment, Exo-Pfu, Platinum Genotype TSP, Hemo Klen Taq, Taq, MasterAMP Taq, Phi29, Bsu, Large Fragment, Exo-Minus Klenow (D355A, E357A), Sequenase Version 2.0, Transcriptor, Maxima, Thermoscript, M-MuLV (RNase H-), AMV, M-MuLV, Monsterscript, and DPO4. Of the polymerases tested, DPO4 (naturally expressed by the archaeon Sulfolobus solfataricus) was the most effective at extending a template-bound primer with dNTP-2c nucleotide analogs. Without being bound by theory, it was speculated that DPO4, and possibly other members of the translesion DNA polymerase family (i.e., class Y DNA polymerases), may be able to effectively utilize bulky nucleotide analogs owing to their relatively large substrate binding sites, which have evolved to accommodate naturally occurring, bulky DNA lesions.

Example 2 Identification of “Hot Spots” for Directed Mutagenesis in the DPO4 Protein and Screen of DPO4 Mutant Libraries to Identify Optimized Sequence Motifs

As a first step in generating DPO4 variants with improved polymerase activity when challenged with bulky substrates, the “HotSpot Wizard” web tool was used to identify amino acids in the DPO4 protein to target for mutagenesis. This tool implements a protein engineering protocol that targets evolutionarily variable amino acid positions located in, e.g., the enzyme active site. “Hot spots” for mutation are selected through the integration of structural, functional, and evolutionary information (see, e.g., Pavelka et al., “HotSpot Wizard: a Web Server for Identification of Hot Spots in Protein Engineering” (2009) Nuc Acids Res 37 doi:10.1093/nar/gkp410). Applying this tool to the DPO4 protein, it was observed that the hot spot residues identified tended to cluster into certain zones, or regions, spread throughout the full amino acid sequence. Arbitrary boundaries were set to distinguish 13 such regions, designated “Mut1”-“Mut13”, in which mutagenesis hot spots are concentrated. These 13 “Mut” regions are illustrated in FIG. 1 with hot spot residues identified by underscoring.

To screen for DPO4 variants with improved polymerase activity based on hot spot mapping, a saturation mutagenesis library was created for each of the 13 Mut regions, in which hot spot amino acids were changed, while conserved amino acids were left unaltered. Screening was conducted using a 96 well plate platform, and polymerase activity was assessed with a primer extension assay using “dNTP-OAc” nucleotide analogs as substrates. These model bulky substrates are substituted with triazole acetate moieties conjugated to alkyne substituents on both the α-phosphate and the nucleobase moieties. Screening results identified two Mut regions in particular that consistently produced DPO4 mutants with enhanced activity. These regions, “Mut_4” and “Mut_11”, correspond to amino acids 76-86 and amino acids 289-304, respectively, of the DPO4 protein. Further analysis of high-performing Mut_4 and Mut_11 variants led to the identification of an optimized variant motif sequence for each region. The optimized Mut_4 motif identified herein is as follows: M76W, K78N, E79L, Q82W, Q83G, and S86E, while that for the Mut_11 region is as follows: V289W, T290K, E291S, D292Y, and L293W.
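The substitution notation used for these motifs (wild-type residue, 1-based position, replacement residue, joined by underscores in the variant tables) can be read mechanically. The following is a minimal sketch, assuming that notation; the function name `apply_substitutions` and the 8-residue toy sequence are hypothetical illustrations and are not taken from the specification or from the DPO4 sequence.

```python
import re

# One substitution such as "M76W": wild-type residue, 1-based position, new residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(wildtype: str, motif: str) -> str:
    """Apply an underscore-separated substitution list (e.g. "M76W_K78N") to a sequence.

    Raises ValueError if a token is malformed, if a position is out of range,
    or if the stated wild-type residue does not match the sequence.
    """
    residues = list(wildtype)
    for token in motif.split("_"):
        match = _SUBSTITUTION.match(token)
        if match is None:
            raise ValueError(f"unrecognized substitution: {token!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= pos <= len(residues):
            raise ValueError(f"{token}: position {pos} is outside the sequence")
        if residues[pos - 1] != old:
            raise ValueError(
                f"{token}: expected {old} at position {pos}, found {residues[pos - 1]}"
            )
        residues[pos - 1] = new
    return "".join(residues)


# Hypothetical toy sequence for illustration only (not DPO4).
toy_sequence = "MKTAYIAK"
print(apply_substitutions(toy_sequence, "K2R_A4G"))  # MRTGYIAK
```

Checking the stated wild-type residue before substituting is a design choice of this sketch: it catches position-numbering mistakes when a motif string is applied to the wrong reference sequence.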

Example 3 Mut_4 Library Screen and Identification of 45 DPO4 Variants with Enhanced Abilities to Incorporate Bulky Nucleotide Analogs into a Growing Daughter Strand

A further screen of the Mut_4 library was conducted in which 3,000 unique variants were screened (representing 0.005% of the library), as described above. This screen identified 45 unique variants as candidate polymerases with enhanced capabilities to utilize bulky nucleotide analogs as substrates. These variants are set forth in Table 2 and identified by the prefix “SGM”. The activity of the variants was further assessed based on their abilities to incorporate the substrates “2c-OAc” (as described above), “1 spermine” (a dNTP analog in which an alkyl linker conjugated to the nucleobase is further conjugated with a long spermine polymer), or “2 spermine” (in which a long spermine polymer is further conjugated to an alkyl linker conjugated to the alpha phosphate of the “1 spermine” analog) in a primer extension assay. The “spermine” analogs are models for very bulky polymerase substrates, and are thus less efficiently incorporated in primer extension assays relative to the less bulky 2c-OAc analog. Extensions of the primer, 5′-WGAACCACTATACTCCTCGATG-3′ (SEQ ID NO: 116) (wherein “W” represents a fluorophore, e.g., Sima Hex), annealed to a 10mer homopolymer template, 5′XGGGGGGGGGGCATCGAGGAGTATACTGGTTCβ-3′ (SEQ ID NO: 117), were conducted in extension “buffer A” (10 mM Tris-OAc, pH 8.3, 100 mM NH₄OAc, pH 8.5, and 2 mM MnCl₂) for the 2c-OAc substrate (2.50 μM dCTP-OAc), or “buffer B” (20 mM Tris-OAc, pH 8.3, 200 mM NH₄OAc, pH 8.8, 20% DMSO, 0.06 μg/μL SSB, 3 mM chain polyphosphate, 25% PEG8000, 10 μM BSA, and 4 mM MnCl₂) for the “spermine” (2.50 μM dCTP-spermine) substrates. Reactions were run for three hours at 55° C. and products were analyzed by gel electrophoresis and fluorescent detection to determine the number of successful extension events of the template-bound primer. The activities of the Mut_4 SGM variants in these assays are set forth in Table 4 below, with the activity of wildtype DPO4 shown in the last row. As can be seen, all variants display extension activity with bulky substrates, with the variant “Mothra”, in particular, displaying notable extension activity with the highly bulky spermine substrates.
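The stated coverage of 0.005% can be checked with a short calculation. The sketch below assumes, as a labeled assumption not stated in the specification, that the Mut_4 library is a full saturation library at the six positions named in the optimized motif (76, 78, 79, 82, 83, and 86), with all 20 standard amino acids allowed at each position. Under that assumption, 3,000 screened variants correspond to roughly 0.0047% of the library, which agrees with the stated figure after rounding.

```python
# Assumption (not from the specification): full saturation at six positions,
# 20 standard amino acids per position.
AMINO_ACIDS = 20
MOTIF_POSITIONS = [76, 78, 79, 82, 83, 86]  # positions named in the Mut_4 motif
SCREENED_VARIANTS = 3_000                   # number screened, per the text

library_size = AMINO_ACIDS ** len(MOTIF_POSITIONS)
fraction_percent = 100 * SCREENED_VARIANTS / library_size

print(library_size)                # 64000000
print(round(fraction_percent, 4))  # 0.0047
```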

TABLE 4 Primer Extension Activities of DPO4 Polymerase Variants using Bulky Substrates

DPO4 variant; Extensions on 2c-OAc; Extensions on 1 spermine; Extensions on 2 spermine
SGM-0001.6 9 3
SGM-0009.2 10 3
SGM-0010.02 8 3
SGM-0010.08 8 3
SGM-0010.17 8 3⁺
SGM-0010.22 8 3
SGM-0010.45 7 3
SGM-0010.46 8 2
SGM-0010.62 7 3
SGM-0010.65 8 3
SGM-0010.72 7 3
SGM-0010.101 8 2
SGM-0010.105 6 2
SGM-0010.115 7 3
SGM-0010.153 7 3
SGM-0010.176 9 4
SGM-0023.29 7 3
SGM-0023.61 7 2
SGM-0023.75 7 3
SGM-0025.47 7 3
SGM-0027.26 8
SGM-0027.35 8 4
SGM-0027.38 8
SGM-0027.45 8 5 6⁺
SGM-0027.64 9 5 5⁺
SGM-0029.25 9 3 6⁺
SGM-0029.45 10 3 4
SGM-0029.87 10 3
SGM-0031.16 10 5 3⁺
SGM-31.16(W76) 6
SGM-0031.33 10 4 3⁺
SGM-31.33(W76) 5
SGM-0031.76 10 3⁺
SGM-0033.35 9 3
SGM-0033.61 10 3
SGM-0034.67 10 3
SGM-0035.78 9 3
SGM-0036.69 9 3
SGM-0037.07 9 3
SGM-0037.53 11 6 5
SGM-0037.65 11 4 6⁺
SGM-0038.06 10 3 5
SGM-0057.37 10 7 5⁺
SGM-MOTHRA 10 8 10
SGM-71.85 11 6
WT DPO4 8 3 3

Example 3 Random Mutagenesis Screen for Improved DPO4 Variants Using Mut_4 Variant Backbone

In a parallel approach to generating DPO4 variants with improved polymerase activity when challenged with bulky substrates, the high-performing DPO4 variant, “MOTHRA”, was targeted for random mutagenesis. The MOTHRA backbone is a Mut_4 variant with the following sequence motif: M76W_K78N_E79L_Q82W_Q83S_S86D. Saturation mutagenesis was used to create a library in which single amino acids spanning the entire MOTHRA backbone were targeted for mutation. Screening of variants was done in the 96 well plate format using a primer extension assay with dNTP-OAc substrates, as described above. Variants displaying the greatest activity in this assay were purified for further analysis. The polymerase activity of each of the 63 purified variants was assessed in a primer extension assay using bulkier nucleotide analogs, termed “RTs”, which have longer hydrocarbon conjugates than dNTP-OAcs, as substrates. Assay results are set forth in Table 5; each variant was ranked as having improved (+), similar (−), or reduced (x) activity as compared to the parental Mut_4 variant, MOTHRA.

TABLE 5 Extension Activities of MOTHRA Variants

position; variant; μg; mutation; result
5; SGM-0134.65; 0.6; F5Y; x
42; SGM 86.5; 0.6; A42V; +
56; SGM-0142.57; 0.3; K56Y; ++
56; SGM-0142.91; 0.3; K56H; ++
66; SGM 103.62; 0.6; K66R; +
67; SGM 104.56; 0.6; I67L; −
87; SGM 95.32; 0.6; R87S; +
88; SGM 94.56; 0.6; I88T; +
89; SGM93.5; 0.6; M89W; +
94; SGM-0097.4; 0.4; E94D; +
94; SGM-0097.08; 0.4; E94S; +
100; SGM-0145.16; 0.3; E100S; −
100; SGM-0145.24; 0.3; E100Q; x
141; SGM-0146.09; 0.3; T141S; x
153; SGM-0156.09; 0.3; I153G; +
153; SGM-0156.16; 0.3; I153F; −
153; SGM-0156.27; 0.3; I153W; ++
153; SGM-0156.65; 0.3; I153Q; ++
155; SGM-0153.02; 0.4; A155L; ++
155; SGM-0153.19; 0.4; A155M; ++
155; SGM-0153.26; 0.3; A155L; +
167; SGM-0104.57; 0.6; I167R; +
181; SGM-0115.65; 0.6; A181M; x
181; SGM-0115.7; 0.6; A181Y; −
181; SGM-0115.89; 0.6; A181T; −
181; SGM-0115.96; 0.6; A181F; x
181; SGM-0115.91; 0.4; A181S; −
184; SGM-0117.14; 0.6; P184W; +
184; SGM-0117.47; 0.6; P184Y; +
184; SGM-0117.64; 0.6; P184F; +
184; SGM-0117.65; 0.6; P184N; −
184; SGM-0117.90; 0.6; P184L; ++
184; SGM-0117.07; 0.6; P184L; ++
184; SGM-0117.08; 0.6; P184H; −
184; SGM-0117.16; 0.6; P184G; +
184; SGM-0117.32; 0.6; P184Q; −
184; SGM-0117.96; 0.6; P184S; −
188; SGM-0118.08; 0.6; N188L; +
188; SGM-0118.24; 0.6; N188D; x
188; SGM-0118.02; 0.6; N188G; x
189; SGM-0128.77; 0.6; I189W; ++
189; SGM-0128.90; 0.6; I189E; −
189; SGM-0128.96; 0.6; I189G; −
240; SGM-0154.12; 0.16; R240T; −
240; SGM-0154.95; 0.08; R240S; −
242; SGM-0125.09; 0.6; R242I; +
242; SGM-0125.73; 0.6; R242M; −
289; SGM-0119.09; 0.6; V289F; −
289; SGM-0119.24; 0.25; V289W; +
293; SGM-0159.24; 0.3; L293F; ++
293; SGM-0159.28; 0.3; L293W; +
293; SGM-0159.55; 0.14; L293R; +
293; SGM-0159.77; 0.3; L293Y; +
294; SGM-0121.16; 0.6; D294W; +
298; SGM-0131.88; 0.4; R298G; −
298; SGM-0131.89; 0.4; R298N; x
298; SGM-0131.92; 0.4; R298Q; −
298; SGM-0131.95; 0.4; R298H; x
300; SGM-0122.08; 0.4; R300T; −
300; SGM-0122.16; 0.4; R300E; ++
300; SGM-0122.17; 0.4; R300G; −
300; SGM-0122.81; 0.3; R300L; x
328; SGM-0123.08; 0.3; R328H; −

Example 4 Semi-Rational Approach to Designing DPO4 Variants with Enhanced Polymerization Activity

To continue evolving DPO4 variants with improved utilization of bulky substrates, a “semi-rational” design approach was taken following a number of different strategies. In one strategy, one or more of the hits identified in the random mutagenesis screen of the MOTHRA backbone variant were combined with the Mut_4 or Mut_4 and Mut_11 optimized sequence motifs described above. In another strategy, changes in other Mut regions (e.g., Mut_6 and/or Mut_7) were introduced into a Mut_4 or a Mut_4 and Mut_11 backbone. The exemplary variants designed following these strategies are set forth in Table 3, with each variant assigned a unique identifier, from “PDC47” to “PDC107”. The activity of each of the PDC variants was assessed using a primer extension assay with substituted nucleotide analogs, as described above. Of the 29 PDC variants generated and analyzed, one in particular emerged as consistently demonstrating improved primer extension activity as compared to parental variants. This variant, PDC79 (SEQ ID NO:78), is based on the Mut_4 and Mut_11 motif background with the addition of A42V, K152G, D156W, P184L, and I189W mutations.

Example 5 Deletion of C-Terminal PIP Box Domain of DPO4 Polymerase

To further optimize properties of high-performing DPO4 variants, the C-terminal “PIP box” domain was targeted for deletion. The PIP box, corresponding to amino acids 341-352 of the wildtype protein, normally functions to, e.g., mediate interaction with PCNA. When not bound to an interacting protein, however, the PIP box lacks structured form (see, e.g., Xing G. et al., (2009) “Structural Insight into Recruitment of Translesion DNA Polymerase Dpo4 to Sliding Clamp PCNA” Mol. Microbiol. 71(3): 678-691). It was speculated that removal of this unstructured region might improve certain structural and/or functional properties of the DPO4 variants, and possibly other DNA polymerases and variants. Standard mutagenesis using the Q5 Polymerase mutagenesis kit (commercially available from NEB®) was used to delete the DNA sequence encoding the PIP box from the cDNA encoding the DPO4 variant PDC79. The plasmid encoding PDC79Δ341-352 (also referred to as PDC108, SEQ ID NO:106), fused with a C-terminal his tag, was transformed into T7 Express lys cells (NEB®) and recombinant protein expression was induced with IPTG for 4 hours at 37° C. Cells were harvested and lysed and recombinant protein was purified with Ni⁺⁺-coated beads using standard techniques. Eluted protein was de-salted, resuspended in storage buffer and quantitated by gel densitometry. Surprisingly and advantageously, deletion of the PIP box was found to increase PDC79 protein yield by approximately 3-fold, likely by improving the solubility of the protein during bacterial expression. Next, the PIP box domain was deleted from several other candidate DPO4 variants. One in particular, CO345 (PDC93Δ341-352, SEQ ID NO:109), showed considerable improvement in yield and became a top candidate for further analysis and modification. Of particular interest was the variant CO416 (SEQ ID NO:110), in which the mutations A155M and I248T were introduced into CO345.
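At the sequence level, the Δ341-352 notation denotes removal of residues 341 through 352, counted from 1 and inclusive of both ends. The following is a minimal sketch of that operation; the function name `delete_residues` and the placeholder sequence are hypothetical, and the 352-residue length is an assumption inferred from the deleted region being described as C-terminal.

```python
def delete_residues(sequence: str, first: int, last: int) -> str:
    """Remove residues first..last (1-based, inclusive), as in the notation Δ341-352."""
    if not 1 <= first <= last <= len(sequence):
        raise ValueError("deletion range lies outside the sequence")
    # Python slices are 0-based and end-exclusive, hence first - 1 and last.
    return sequence[: first - 1] + sequence[last:]


# Placeholder sequence for illustration only (not DPO4): 340 residues
# followed by a 12-residue C-terminal stretch standing in for the PIP box.
placeholder = "A" * 340 + "G" * 12
truncated = delete_residues(placeholder, 341, 352)

print(len(placeholder), len(truncated))  # 352 340
```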

Example 6 Further Optimization of PIP Box Deletion Mutants

During the course of screening various DPO4 libraries, one library that was consistently observed to generate high-functioning DPO4 variants targets the little finger domain of the polymerase protein, corresponding to amino acids 316, 295, 297, 299 and 301. Based on the crystal structure of the DPO4 protein, these residues are predicted to project into an aqueous channel occupied by the DNA helix. A mutant from this library, CO681 (SEQ ID NO:111), was found to function equal to or better than CO416 in primer extension assays using bulky nucleotide analog substrates. As the mutation A57P was previously shown to markedly increase protein yield, this mutation was introduced into CO681 to generate variant CO935 (SEQ ID NO:112). CO935, indeed, demonstrated an improved yield during bacterial expression.

Another library, L267, targeting residues 187-190, became of interest as this region is predicted to contribute to a loop and helix in the thumb domain of the DPO4 polymerase, at the closest point of contact with the backbone of the DNA primer strand. Screening of this library identified a new variant, CO534 (SEQ ID NO:113), that also displayed robust primer extension activity in the assays described herein. Two additional variants were designed that combined different features of the top candidates, CO416 and CO534. Two resulting variants in particular, C1050 (SEQ ID NO:114) and C1051 (SEQ ID NO:115), emerged as top candidates for further analysis and modification.

All of the U.S. patents, U.S. patent application publications, U.S. patent applications, foreign patents, foreign patent applications and non-patent publications referred to in this specification and/or listed in the Application Data Sheet, including but not limited to U.S. Provisional Application Nos. 62/255,918 and 62/328,967, as well as U.S. Pat. No. 7,939,259 and PCT Publication No. WO 2016/081871, are incorporated herein by reference in their entirety. Such documents may be incorporated by reference for the purpose of describing and disclosing, for example, materials and methodologies described in the publications, which might be used in connection with the presently described invention.

1. An isolated recombinant DNA polymerase, which recombinant DNApolymerase comprises an amino acid sequence that is at least 90%identical to SEQ ID NO:1, which recombinant polymerase comprises atleast one mutation at a position selected from the group consisting ofamino acids 76, 78, 79, 82, 83, and 86, wherein identification ofpositions is relative to wildtype DPO4 polymerase (SEQ ID NO:1), andwhich recombinant DNA polymerase exhibits polymerase activity.
 2. Thepolymerase of claim 1, wherein the mutation at position 76 is selectedfrom the group consisting of M76H, M76W, M76V, M76A, M76S, M76L, M76T,M76C, M76F, and M76Q.
 3. The polymerase of claim 1, wherein the mutationat position 78 is selected from the group consisting of K78P, K78N,K78Q, K78T, K78L, K78V, K78S, K78F, K78E, K78M, K78A, K78I, K78H, K78Y,and K78G.
 4. The polymerase of claim 1, wherein the mutation at position79 is selected from the group consisting of E79L, E79M, E79W, E79V,E79N, E79Y, E79G, E79S, E79H, E79A, E79R, E79T, E79P, E79D, and E79F. 5.The polymerase of claim 1, wherein the mutation at position 82 isselected from the group consisting of Q82Y, Q82W, Q82N, Q82S, Q82H,Q82D, Q82E, Q82G, Q82M, Q82R, Q82K, Q82V, and Q82T.
 6. The polymerase ofclaim 1, wherein the mutation at position 83 is selected from the groupconsisting of Q83G, Q83R, Q83S, Q83T, Q83I, Q83E, Q83M, Q83D, Q83A,Q83K, and Q83H.
 7. The polymerase of claim 1, wherein the mutation atposition 86 is selected from the group consisting of S86E, S86L, S86W,S86K, S86N, S86Q, S86V, S86M, S86T, S86G, S86R, S86A, and S86D.
 8. Thepolymerase of claim 1, comprising the amino acid sequence as set forthin any one of SEQ ID NOs: 2-46.
9. A composition comprising a recombinant DNA polymerase as set forth in claim 1.
10. The composition of claim 9, wherein the composition is present in a DNA sequencing system that comprises at least one non-natural nucleotide analog substrate.
11. A modified nucleic acid encoding a modified DPO4-type DNA polymerase as set forth in claim 1.
12. An isolated recombinant DNA polymerase, wherein the recombinant DNA polymerase comprises an amino acid sequence that is at least 90% identical to SEQ ID NO:1, wherein the recombinant polymerase comprises mutations at positions 76, 78, 79, 82, 83, and 86 and at least one mutation at a position selected from the group consisting of 5, 42, 56, 57, 62, 66, 141, 150, 152, 153, 155, 156, 184, 187, 188, 189, 190, 212, 214, 215, 217, 221, 226, 240, 241, 248, 289, 290, 291, 292, 293, 295, 297, 299, 300, 301, and 326, wherein identification of positions is relative to wildtype DPO4 polymerase (SEQ ID NO:1), and wherein the recombinant DNA polymerase exhibits polymerase activity.
13. The polymerase of claim 12, wherein the mutations at positions 76, 78, 79, 82, 83, and 86 are M76W, K78N, E79L, Q82W, Q83G, and S86E.
14. The polymerase of claim 12, wherein at least one of the following is satisfied: the mutation at position 5 is F5Y; the mutation at position 42 is A42V; the mutation at position 56 is K56Y; the mutation at position 57 is A57P; the mutation at position 62 is V62R; the mutation at position 66 is K66R; the mutation at position 141 is T141S; the mutation at position 150 is F150L; the mutation at position 152 is K152A, K152G, K152M, or K152P; the mutation at position 153 is I153F, I153Q or I153W; the mutation at position 155 is A155L, A155M, A155N, A155V, or A155G; the mutation at position 156 is D156Y or D156W; the mutation at position 184 is P184L; the mutation at position 187 is G187W, G187D, G187P, or G187E; the mutation at position 188 is N188Y; the mutation at position 189 is I189W; the mutation at position 190 is T190Y, T190D, or T190E; the mutation at position 212 is K212V, K212L, or K212A; the mutation at position 214 is K214S; the mutation at position 215 is G215F; the mutation at position 217 is I217V; the mutation at position 221 is K221D, K221E, or K221Q; the mutation at position 226 is I226F; the mutation at position 240 is R240S or R240T; the mutation at position 241 is V241N or V241R; the mutation at position 248 is I248A or I248T; the mutation at position 289 is V289W; the mutation at position 290 is T290K or T290R; the mutation at position 291 is E291S; the mutation at position 292 is D292Y; the mutation at position 293 is L293F or L293W; the mutation at position 295 is I295Y; the mutation at position 297 is S297H; the mutation at position 299 is G299L; the mutation at position 300 is R300E or R300V; the mutation at position 301 is T301R; or the mutation at position 326 is D326E.
15-50. (canceled)
51. The polymerase of claim 12, comprising the amino acid sequence as set forth in any one of SEQ ID NOs: 47-115.
52. A composition comprising a recombinant DNA polymerase as set forth in claim 12.
53. The composition of claim 52, wherein the composition is present in a DNA sequencing system that comprises at least one non-natural nucleotide analog substrate.
54. A modified nucleic acid encoding a modified DPO4-type DNA polymerase as set forth in claim 12.
55. An isolated recombinant DNA polymerase, wherein the recombinant DNA polymerase is capable of synthesizing nucleic acid daughter strands using nucleotide analog substrates having the following structure:

wherein T represents a tether; N represents a nucleobase residue; V represents an internal cleavage site of the nucleobase residue; and R¹ and R² represent the same or different end groups for the template directed synthesis of the daughter strand.
56. The isolated recombinant DNA polymerase of claim 55, wherein the recombinant DNA polymerase is a class Y DNA polymerase, or a variant thereof.
57. The isolated recombinant DNA polymerase of claim 55, wherein the recombinant DNA polymerase is DPO4 or Dbh, or a variant thereof.
58. The isolated recombinant DNA polymerase of claim 55, wherein the recombinant DNA polymerase is DPO4 (SEQ ID NO:1), or a variant thereof.
59. The isolated recombinant DNA polymerase of claim 55, comprising a deletion to remove the PIP box region of the protein.
60. The isolated recombinant DNA polymerase of claim 59, wherein the deletion comprises the terminal 12 amino acids of the protein.
61. A composition comprising a recombinant DNA polymerase as set forth in claim 55.
62. The composition of claim 61, wherein the composition is present in a DNA sequencing system that comprises at least one non-natural nucleotide analog substrate.
63. A modified nucleic acid encoding a modified DPO4-type DNA polymerase as set forth in claim 55.